Visualizing and Understanding Convolutional Networks
A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models. Used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.
ADADELTA: An Adaptive Learning Rate Method
We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first order information and has minimal computational ...
Regularization of Neural Networks using DropConnect
This work introduces DropConnect, a generalization of Dropout, for regularizing large fully-connected layers within neural networks, and derives a bound on the generalization performance of both Dropout and DropConnect.
Deconvolutional networks
This work presents a learning framework in which features that capture mid-level cues emerge spontaneously from image data. The framework is based on the convolutional decomposition of images under a sparsity constraint and is entirely unsupervised.
Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
We introduce a simple and effective method for regularizing large convolutional neural networks. We replace the conventional deterministic pooling operations with a stochastic procedure, randomly ...
Adaptive deconvolutional networks for mid and high level feature learning
A hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling, relying on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches.
On rectified linear units for speech processing
This work shows that substituting rectified linear units for logistic units improves generalization and makes training of deep networks faster and simpler.
Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
A Category Traversal Module is introduced that can be inserted as a plug-and-play module into most metric-learning based few-shot learners, identifying task-relevant features based on both intra-class commonality and inter-class uniqueness in the feature space.
Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines
We present a type of Temporal Restricted Boltzmann Machine that defines a probability distribution over an output sequence conditional on an input sequence. It shares the desirable properties of ...