Learning with Hierarchical-Deep Models

  title={Learning with Hierarchical-Deep Models},
  author={Ruslan Salakhutdinov and Joshua B. Tenenbaum and Antonio Torralba},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
We introduce HD (or “Hierarchical-Deep”) models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian (HB) models. Specifically, we show how we can learn a hierarchical Dirichlet process (HDP) prior over the activities of the top-level features in a deep Boltzmann machine (DBM). This compound HDP-DBM model learns to learn novel concepts from very few training example by learning low-level generic features, high-level features that… 

Learning to Learn Visual Object Categories by Integrating Deep Learning with Hierarchical Bayes

This work proposes a model that combines powerful features extracted from a deep neural network with a semantic structure inferred using probabilistic Hierarchical Bayes, and test and demonstrate the capabilities of this model in three different tasks.

One-Shot Generalization in Deep Generative Models

New deep generative models are developed, models that combine the representational power of deep learning with the inferential power of Bayesian reasoning, and are able to generate compelling and diverse samples, providing an important class of general-purpose models for one-shot machine learning.

Zero-Shot Learning via Class-Conditioned Deep Generative Models

A deep generative model for Zero-Shot Learning that represents each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes, which facilitates learning highly discriminative feature representations for the inputs.

One-shot learning by inverting a compositional causal process

A Hierarchical Bayesian model based on com-positionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image.

Few-shot Generative Modelling with Generative Matching Networks

This work develops a new generative model called Generative Matching Network which is inspired by the recently proposed matching networks for one-shot learning in discriminative tasks and can instantly learn new concepts that were not available in the training data but conform to a similar generative process.

Deep Boltzmann Machines based vehicle recognition

This work proposes to learn manifold hierarchical features where the high-level discriminative features are obtained by capturing correlations among the learned low-level generic features via a deep learning model, called Deep Boltzmann Machines (DBM), a powerful hierarchical generative model for feature learning.

Combining Deep Universal Features, Semantic Attributes, and Hierarchical Classification for Zero-Shot Learning

This work addresses zero-shot (ZS) learning, building upon prior work in hierarchical classification by combining it with approaches based on semantic attribute estimation, and shows that using input posteriorsbased on semantic attributes improves the expected reward for novel classes.

Learning Robust Visual-Semantic Embeddings

An end-to-end learning framework that is able to extract more robust multi-modal representations across domains and a novel technique of unsupervised-data adaptation inference is introduced to construct more comprehensive embeddings for both labeled and unlabeled data.

Reconciling meta-learning and continual learning with online mixtures of tasks

This work uses the connection between gradient-based meta-learning and hierarchical Bayes to propose a Dirichlet process mixture of hierarchical Bayesian models over the parameters of an arbitrary parametric model such as a neural network.



Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

A Fast Learning Algorithm for Deep Belief Nets

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Sparse Feature Learning for Deep Belief Networks

This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.

Exploring Strategies for Training Deep Neural Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input.

One shot learning of simple visual concepts

A generative model of how characters are composed from strokes is introduced, where knowledge from previous characters helps to infer the latent strokes in novel characters, using a massive new dataset of handwritten characters.

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Acoustic Modeling Using Deep Belief Networks

It is shown that better phone recognition on the TIMIT dataset can be achieved by replacing Gaussian mixture models by deep neural networks that contain many layers of features and a very large number of parameters.

Implicit Mixtures of Restricted Boltzmann Machines

Results for the MNIST and NORB datasets are presented showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.

Extracting and composing robust features with denoising autoencoders

This work introduces and motivate a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern.

Modeling Transfer Learning in Human Categorization with the Hierarchical Dirichlet Process

This work proposes an explanation for transfer learning effects in human categorization by implementing a model from the statistical machine learning literature – the hierarchical Dirichlet process (HDP) – to make empirical evaluations of its ability to explain these effects.