
Name Autoencoder


Train, via unsupervised learning, a model that is able to recreate the original input representation.


How can we train a model without requiring the training data to be labeled?
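An autoencoder answers this by using the input itself as the training target: an encoder compresses the input to a code, a decoder reconstructs the input from that code, and the reconstruction error is the loss. A minimal sketch with a linear autoencoder in NumPy (the data, dimensions, and learning rate are illustrative choices, not from any of the papers below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 8 dimensions lying on a 2-D linear subspace,
# so a 2-D code suffices to reconstruct them.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
X /= X.std()

# Linear autoencoder: encoder (8 -> 2), decoder (2 -> 8).
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

def loss(X, W_enc, W_dec):
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial = loss(X, W_enc, W_dec)
lr = 0.05
for _ in range(2000):
    code = X @ W_enc              # encode to the 2-D representation
    err = code @ W_dec - X        # the "label" is the input itself
    g_dec = code.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
final = loss(X, W_enc, W_dec)
```

No labels appear anywhere: the supervision signal is reconstruction of the input, which is what makes the learned code reusable for other tasks on the same input domain.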




Known Uses

Variational Autoencoder

Related Patterns





http://www.deeplearningbook.org/contents/representation.html 15.1 Greedy Layer-Wise Unsupervised Pretraining. Unsupervised learning played a key historical role in the revival of deep neural networks, allowing for the first time to train a deep supervised network without requiring architectural specializations like convolution or recurrence. We call this procedure unsupervised pretraining, or more precisely, greedy layer-wise unsupervised pretraining. This procedure is a canonical example of how a representation learned for one task (unsupervised learning, trying to capture the shape of the input distribution) can sometimes be useful for another task (supervised learning with the same input domain).
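The "greedy layer-wise" part can be sketched as: train the first autoencoder layer on the raw input, then train the next layer on the first layer's codes, and so on, never touching labels. A toy NumPy version using linear layers (the dimensions and training schedule are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_ae(X, code_dim, lr=0.05, steps=2000):
    """Train one linear autoencoder layer by gradient descent on the
    reconstruction error; return the learned encoder weights."""
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, code_dim))
    W_dec = rng.normal(scale=0.1, size=(code_dim, d))
    for _ in range(steps):
        code = X @ W_enc
        err = code @ W_dec - X
        W_dec -= lr * code.T @ err / n
        W_enc -= lr * X.T @ (err @ W_dec.T) / n
    return W_enc

# Toy unlabeled data with low-dimensional structure.
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 16))
X /= X.std()

# Greedy layer-wise pretraining: fit layer 1 on the raw input,
# then fit layer 2 on layer 1's codes.
W1 = train_linear_ae(X, 8)
H1 = X @ W1
W2 = train_linear_ae(H1, 4)
H2 = H1 @ W2
# The stacked encoder X -> H1 -> H2 can now initialize a supervised network.
```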

http://arxiv.org/pdf/1603.06653v1.pdf Information Theoretic-Learning Auto-Encoder

Information-theoretic learning (ITL) is a field at the intersection of machine learning and information theory which encompasses a family of algorithms that compute and optimize information-theoretic descriptors such as entropy, divergence, and mutual information. ITL objectives are computed directly from samples (non-parametrically) using Parzen windowing and Renyi's entropy.
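The key ingredient is that Renyi's quadratic entropy has a closed-form Parzen-window estimator: H2 = -log V, where V (the "information potential") is the mean Gaussian kernel over all pairwise sample differences. A 1-D sketch (the kernel width and sample sets are illustrative):

```python
import numpy as np

def renyi_quadratic_entropy(x, sigma=1.0):
    """Parzen-window estimate of Renyi's quadratic entropy H2 = -log V,
    where V is the mean of Gaussian kernels evaluated at all pairwise
    sample differences (two convolved kernels give variance 2*sigma^2)."""
    diff = x[:, None] - x[None, :]
    var = 2.0 * sigma ** 2
    pairwise = np.exp(-diff ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    return -np.log(pairwise.mean())

# Tightly clustered samples should have lower entropy than spread-out ones.
tight = np.random.default_rng(0).normal(scale=0.1, size=100)
spread = np.random.default_rng(0).normal(scale=5.0, size=100)
```

Because the estimate is a differentiable function of the samples, it can be used directly as a training objective, which is what makes ITL descriptors usable inside an autoencoder loss.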



http://arxiv.org/pdf/1606.04934v1.pdf Improving Variational Inference with Inverse Autoregressive Flow

https://arxiv.org/pdf/1611.09842v1.pdf Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction

The method adds a split to the network, resulting in two disjoint sub-networks. Each sub-network is trained to perform a difficult task – predicting one subset of the data channels from another. Together, the sub-networks extract features from the entire input signal. By forcing the network to solve cross-channel prediction tasks, we induce a representation within the network which transfers well to other, unseen tasks.

The proposed method solves some of the weaknesses of previous self-supervised methods. Specifically, the method (i) does not require a representational bottleneck for training, (ii) uses input dropout to help force abstraction in the representation, and (iii) is pre-trained on full images, and thus able to extract features from the full input data.
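The cross-channel prediction idea can be sketched with linear "sub-networks" standing in for the paper's convnets (the data generator and least-squares predictors are illustrative assumptions, not the authors' setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multi-channel data: 6 channels driven by 3 shared latent factors,
# so each half of the channels is predictable from the other half.
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 6)) + 0.1 * rng.normal(size=(500, 6))

# Split the channels into two disjoint subsets.
X1, X2 = X[:, :3], X[:, 3:]

def fit_predictor(A, B):
    """Least-squares linear map predicting B from A (a stand-in sub-network)."""
    W, *_ = np.linalg.lstsq(A, B, rcond=None)
    return W

W_12 = fit_predictor(X1, X2)   # sub-network 1: predict second half from first
W_21 = fit_predictor(X2, X1)   # sub-network 2: predict first half from second

# The learned representation concatenates both sub-networks' outputs,
# so it covers the entire input signal.
features = np.concatenate([X1 @ W_12, X2 @ W_21], axis=1)
```

Each predictor can only succeed by recovering the shared latent factors, which is why cross-channel prediction induces a transferable representation.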

https://arxiv.org/abs/1506.02351v8 Stacked What-Where Auto-encoders

We present a novel architecture, the “stacked what-where auto-encoders” (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training.

The overall system, which can be seen as pairing a Convnet with a Deconvnet, yields good accuracy on a variety of semi-supervised and supervised tasks.
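The "what-where" split refers to max pooling: the pooled values ("what") flow up the discriminative pathway, while the argmax positions ("where") are passed laterally to the generative pathway so the deconvnet can unpool accurately. A 1-D toy sketch of that mechanism (illustrative, not the paper's full architecture):

```python
import numpy as np

def pool_what_where(x, k=2):
    """Max-pool a 1-D signal: 'what' = the max values,
    'where' = the argmax position within each pooling window."""
    windows = x.reshape(-1, k)
    what = windows.max(axis=1)
    where = windows.argmax(axis=1)
    return what, where

def unpool(what, where, k=2):
    """Reconstruct by placing each 'what' value back at its 'where'
    position; non-max entries are zero-filled."""
    out = np.zeros((len(what), k))
    out[np.arange(len(what)), where] = what
    return out.reshape(-1)

x = np.array([1.0, 3.0, 2.0, 0.5, 4.0, 4.5])
what, where = pool_what_where(x)
x_hat = unpool(what, where)
```

Without the "where" variables, unpooling would have to guess positions; keeping them is what lets the generative pathway reconstruct precisely without sampling.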

https://thecuriousaicompany.com/connection-to-g/ Learning by Denoising Part 1: What and why of denoising


https://arxiv.org/abs/1712.07788v2 Deep Unsupervised Clustering Using Mixture of Autoencoders

https://openreview.net/forum?id=HkL7n1-0b Wasserstein Auto-Encoders

https://arxiv.org/abs/1805.09804v1 Implicit Autoencoders

In this paper, we describe the “implicit autoencoder” (IAE), a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions. We use two generative adversarial networks to define the reconstruction and the regularization cost functions of the implicit autoencoder, and derive the learning rules based on maximum-likelihood learning. Using implicit distributions allows us to learn more expressive posterior and conditional likelihood distributions for the autoencoder. Learning an expressive conditional likelihood distribution enables the latent code to only capture the abstract and high-level information of the data, while the remaining information is captured by the implicit conditional likelihood distribution. For example, we show that implicit autoencoders can disentangle the global and local information, and perform deterministic or stochastic reconstructions of the images. We further show that implicit autoencoders can disentangle discrete underlying factors of variation from the continuous factors in an unsupervised fashion, and perform clustering and semi-supervised learning.

https://arxiv.org/abs/1806.08462v1 Probabilistic Natural Language Generation with Wasserstein Autoencoders

https://colinraffel.com/publications/arxiv2018understanding.pdf https://colinraffel.com/talks/vector2018few.pdf