Scoring and Classifying with Gated Auto-Encoders

D. Im, Graham W. Taylor
Auto-encoders are perhaps the best-known non-probabilistic methods for representation learning. [...] On a set of deep learning benchmarks, we also demonstrate their effectiveness for single- and multi-label classification.
Modeling Musical Structure with Artificial Neural Networks
This thesis explores the application of ANNs to different aspects of musical structure modeling, identifies some of the challenges involved and proposes strategies to address them, and motivates the relevance of musical transformations in structure modeling, showing how a connectionist model, the Gated Autoencoder, can be employed to learn transformations between musical fragments.
Gate-Layer Autoencoders with Application to Incomplete EEG Signal Recovery
This paper proposes a new AE architecture, the Gate-Layer AE (GLAE), which gives it an inherent ability to recover missing variables from the available ones and to act as a concurrent multi-function approximator.


On autoencoder scoring
This paper shows how an autoencoder can assign meaningful scores to data, independently of the training procedure and without reference to any probabilistic model, by interpreting it as a dynamical system, and how multiple unnormalized scores can be combined into a generative classifier.
Gated Autoencoders with Tied Input Weights
The semantic interpretation of images is one of the core applications of deep learning. Several techniques have been recently proposed to model the relation between two images, with application to …
A Connection Between Score Matching and Denoising Autoencoders
A proper probabilistic model for the denoising autoencoder technique is defined, which makes it in principle possible to sample from the model or rank examples by their energy, and a different way to apply score matching is suggested that is related to learning to denoise and does not require computing second derivatives.
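The denoising-to-score-matching connection can be illustrated with a toy sketch. Everything here is an assumption for illustration (standard-normal data, Gaussian corruption, a linear score model), not from the paper itself: regressing a score model onto the denoising target (x - x̃)/σ² recovers the score of the noise-smoothed density, which for this setup is -x̃/(1+σ²).

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5
n = 200_000

# Data from a standard normal, corrupted with Gaussian noise of std sigma.
x = rng.standard_normal(n)
x_tilde = x + sigma * rng.standard_normal(n)

# Denoising score matching regresses a score model s(x_tilde) onto the
# target (x - x_tilde) / sigma^2.  For this Gaussian toy case the optimal
# linear fit is the score of the smoothed density N(0, 1 + sigma^2),
# i.e. s*(v) = -v / (1 + sigma^2).
target = (x - x_tilde) / sigma**2
slope = np.sum(x_tilde * target) / np.sum(x_tilde**2)  # least-squares fit

print(slope, -1.0 / (1.0 + sigma**2))
```

The least-squares slope converges to -1/(1+σ²) ≈ -0.8, matching the smoothed score, which is the regression identity underlying the paper's equivalence.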
What regularized auto-encoders learn from the data-generating distribution
It is shown that the auto-encoder captures the score (the derivative of the log-density with respect to the input), which contradicts previous interpretations of reconstruction error as an energy function.
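One way to see why the reconstruction should estimate the score is Tweedie's formula; this short derivation is a sketch of that standard argument, not a reproduction of the paper's proof. For corruption x̃ = x + ε with ε ~ N(0, σ²I) and smoothed density p_σ, the MSE-optimal denoiser is the posterior mean, and

```latex
r^\ast(\tilde{x}) \;=\; \mathbb{E}[x \mid \tilde{x}]
  \;=\; \tilde{x} + \sigma^2 \,\nabla_{\tilde{x}} \log p_\sigma(\tilde{x}),
\qquad\text{so}\qquad
\frac{r^\ast(\tilde{x}) - \tilde{x}}{\sigma^2}
  \;=\; \nabla_{\tilde{x}} \log p_\sigma(\tilde{x})
  \;\xrightarrow[\sigma \to 0]{}\; \nabla_x \log p(x).
```

That is, the scaled reconstruction residual of a well-trained denoising auto-encoder is an estimator of the score of the data-generating distribution.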
Extracting and composing robust features with denoising autoencoders
This work introduces and motivates a new training principle for unsupervised learning of a representation, based on the idea of making the learned representations robust to partial corruption of the input pattern.
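The training principle can be sketched in a few lines of NumPy: corrupt the input (here with masking noise), then train the auto-encoder to reconstruct the clean input. The architecture, data, and hyperparameters below are illustrative assumptions (tied weights, linear decoder, full-batch gradient descent), not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 8-D observations generated from a 2-D latent factor.
z = rng.standard_normal((500, 2))
A = rng.standard_normal((2, 8))
X = z @ A

n_vis, n_hid = 8, 16
W = 0.1 * rng.standard_normal((n_hid, n_vis))  # tied encoder/decoder weights
b = np.zeros(n_hid)
c = np.zeros(n_vis)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def step(X, lr=0.005, p_corrupt=0.3):
    global W, b, c
    # Masking corruption: zero out each input entry with prob p_corrupt.
    mask = rng.random(X.shape) > p_corrupt
    Xc = X * mask
    H = sigmoid(Xc @ W.T + b)            # encoder on the corrupted input
    R = H @ W + c                        # linear decoder, tied weights
    err = R - X                          # reconstruct the *clean* input
    loss = 0.5 * np.mean(np.sum(err**2, axis=1))
    # Backprop through the tied-weight autoencoder.
    delta = (err @ W.T) * H * (1 - H)    # gradient at the encoder pre-activation
    n = X.shape[0]
    W -= lr * (H.T @ err + delta.T @ Xc) / n
    b -= lr * delta.mean(axis=0)
    c -= lr * err.mean(axis=0)
    return loss

losses = [step(X) for _ in range(300)]
print(losses[0], losses[-1])
```

The key design point is visible in `step`: the encoder sees `Xc` but the error is measured against `X`, which is what forces the representation to be robust to the corruption.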
Contractive Auto-Encoders: Explicit Invariance During Feature Extraction
It is found empirically that this penalty helps to carve a representation that better captures the local directions of variation dictated by the data, corresponding to a lower-dimensional non-linear manifold, while being more invariant to the vast majority of directions orthogonal to the manifold.
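For a sigmoid encoder h(x) = sigmoid(Wx + b) the contractive penalty — the squared Frobenius norm of the encoder Jacobian — has a cheap closed form, since J = diag(h∘(1−h))·W. A minimal sketch, with random weights standing in for a trained model, checked against a finite-difference Jacobian:

```python
import numpy as np

rng = np.random.default_rng(2)
n_vis, n_hid = 5, 4
W = rng.standard_normal((n_hid, n_vis))
b = rng.standard_normal(n_hid)
x = rng.standard_normal(n_vis)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Closed form: ||J||_F^2 = sum_j (h_j (1 - h_j))^2 * sum_i W_ji^2.
h = sigmoid(W @ x + b)
penalty = np.sum((h * (1 - h))**2 * np.sum(W**2, axis=1))

# Check against a central finite-difference Jacobian of the encoder.
eps = 1e-6
J_num = np.zeros((n_hid, n_vis))
for i in range(n_vis):
    d = np.zeros(n_vis); d[i] = eps
    J_num[:, i] = (sigmoid(W @ (x + d) + b) - sigmoid(W @ (x - d) + b)) / (2 * eps)

print(penalty, np.sum(J_num**2))
```

The closed form is why the penalty adds little cost on top of ordinary forward propagation: it reuses `h` and the row norms of `W`.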
The Potential Energy of an Autoencoder
It is shown how most common autoencoders are naturally associated with an energy function, independent of the training procedure, and that the energy landscape can be inferred analytically by integrating the reconstruction function of the autoencoder.
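For one common case — a tied-weight autoencoder with sigmoid hiddens and linear output — integrating the vector field r(x) − x gives a closed-form energy built from softplus terms. The sketch below (random parameters, illustrative only) verifies numerically that the gradient of that energy equals r(x) − x:

```python
import numpy as np

rng = np.random.default_rng(3)
n_vis, n_hid = 6, 10
W = rng.standard_normal((n_hid, n_vis))
b = rng.standard_normal(n_hid)
c = rng.standard_normal(n_vis)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def reconstruct(x):
    return W.T @ sigmoid(W @ x + b) + c   # tied-weight AE, linear output

def energy(x):
    # Antiderivative of r(x) - x: softplus terms integrate the encoder,
    # the quadratic term integrates the -x part of the vector field.
    return np.sum(np.logaddexp(0.0, W @ x + b)) + c @ x - 0.5 * x @ x

x = rng.standard_normal(n_vis)
# grad E should equal r(x) - x; check by central finite differences.
eps = 1e-6
grad = np.array([(energy(x + eps * e) - energy(x - eps * e)) / (2 * eps)
                 for e in np.eye(n_vis)])
gap = np.max(np.abs(grad - (reconstruct(x) - x)))
print(gap)
```

Because the energy is known analytically, it can be evaluated at any point without retracing reconstruction trajectories, which is what makes score-based comparison of inputs practical.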
Conditional Restricted Boltzmann Machines for Structured Output Prediction
This work argues that standard Contrastive Divergence-based learning may not be suitable for training CRBMs, proposes an improved learning algorithm for two distinct types of structured output prediction problems, and shows that the new learning algorithms can work much better than Contrastive Divergence on both types of problems.
Deep Generative Stochastic Networks Trainable by Backprop
Theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders are provided, yielding along the way an interesting justification for dependency networks and generalized pseudolikelihood.
Gated Softmax Classification
A fully probabilistic model that computes class probabilities by combining an input vector multiplicatively with a vector of binary latent variables is described, and it is shown that this model can achieve classification performance that is competitive with (kernel) SVMs, backpropagation, and deep belief nets.