Publications
Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
TLDR
This work clearly establishes the value of using a denoising criterion as a tractable unsupervised objective to guide the learning of useful higher-level representations.
Extracting and composing robust features with denoising autoencoders
TLDR
This work introduces and motivates a new training principle for unsupervised learning of a representation, based on the idea of making the learned representations robust to partial corruption of the input pattern.
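The two entries above rest on the same denoising criterion: corrupt the input, then train the model to reconstruct the clean version, so the learned code must capture the input's structure rather than copy it. A minimal single-layer NumPy sketch with masking noise and tied weights (sizes and names are illustrative, not the papers' code; stacking repeats the procedure with the learned code as the next layer's input):

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 20, 8                                  # illustrative toy sizes
W = rng.normal(0.0, 0.1, size=(d, h))         # tied encoder/decoder weights
b_h, b_v = np.zeros(h), np.zeros(d)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dae_step(x, corruption=0.3, lr=0.1):
    """One gradient step on the denoising criterion for a single example."""
    global W, b_h, b_v
    x_tilde = x * (rng.random(d) > corruption)  # masking noise: zero ~30%
    y = sigmoid(x_tilde @ W + b_h)              # code of the corrupted input
    z = sigmoid(y @ W.T + b_v)                  # reconstruction
    # Cross-entropy between the reconstruction and the *clean* input x.
    dz = z - x                                  # grad wrt pre-activation of z
    dy = (dz @ W) * y * (1.0 - y)               # grad wrt pre-activation of y
    W -= lr * (np.outer(x_tilde, dy) + np.outer(dz, y))
    b_v -= lr * dz
    b_h -= lr * dy

# Train on random binary vectors; a stacked DAE would now train a second
# layer on the codes y produced by this one.
data = (rng.random((500, d)) > 0.5).astype(float)
for x in data:
    dae_step(x)
```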
Theano: A Python framework for fast computation of mathematical expressions
TLDR
The performance of Theano is compared against Torch7 and TensorFlow on several machine learning models, and recently introduced functionalities and improvements are discussed.
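For context, the workflow the paper describes, in its classic form: declare symbolic variables, build an expression graph, differentiate it symbolically, and compile it into a fast callable. (This uses the historical Theano API; the project is no longer actively maintained.)

```python
import theano
import theano.tensor as T

x = T.dvector("x")                       # symbolic inputs
w = T.dvector("w")
loss = ((x * w).sum() - 1.0) ** 2        # symbolic scalar expression
grad = T.grad(loss, w)                   # symbolic differentiation
f = theano.function([x, w], [loss, grad])  # compiled into optimized code

print(f([1.0, 2.0], [0.5, 0.5]))         # evaluates loss and gradient
```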
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
TLDR
This work proposes Meta-Dataset, a new benchmark for training and evaluating models that is large-scale, consists of diverse datasets, and presents more realistic tasks, along with a new set of baselines for quantifying the benefit of meta-learning in Meta-Dataset.
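A hypothetical sketch of the task format such benchmarks evaluate: N-way k-shot episodes sampled from a pool of labeled examples. This is illustrative only, not Meta-Dataset's API or its more realistic variable-way, variable-shot episode sampler.

```python
import random

def sample_episode(examples_by_class, n_way=5, k_shot=1, n_query=10):
    """Sample one few-shot episode: a support set to adapt on, a query
    set to score on. `examples_by_class` maps class name -> examples."""
    classes = random.sample(sorted(examples_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        pool = random.sample(examples_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in pool[:k_shot]]
        query += [(x, label) for x in pool[k_shot:]]
    return support, query

data = {c: list(range(30)) for c in "abcdefg"}   # toy stand-in dataset
support, query = sample_episode(data)
```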
The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training
TLDR
The experiments confirm and clarify the advantage of unsupervised pre-training, and empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.
Topmoumoute Online Natural Gradient Algorithm
TLDR
This work presents TONGA, an efficient, general, online approximation to natural gradient descent suited to large-scale problems; experiments show much faster convergence, both in computation time and in number of iterations, than with stochastic gradient descent, even on very large datasets.
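A simplified illustration of the core trick such online natural-gradient methods rely on (not TONGA's exact algorithm): precondition the mean gradient by a damped low-rank estimate of the gradient covariance, solving in the small k-by-k space via the Woodbury identity so the d-by-d matrix is never formed.

```python
import numpy as np

def natural_gradient_step(grads, mean_grad, damping=1e-2):
    """Return (C + damping*I)^{-1} mean_grad, where C = G'G/k is the
    empirical covariance of k recent per-example gradients (rows of grads)."""
    k, d = grads.shape
    G = grads - grads.mean(axis=0)            # centered per-example gradients
    A = G @ G.T / k + damping * np.eye(k)     # small k x k system (Woodbury)
    return (mean_grad - G.T @ np.linalg.solve(A, G @ mean_grad) / k) / damping

rng = np.random.default_rng(0)
grads = rng.normal(size=(16, 100))            # 16 recent gradients, d = 100
step = natural_gradient_step(grads, grads.mean(axis=0))
```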
Negative eigenvalues of the Hessian in deep neural networks
TLDR
This work studies the loss landscape of deep networks through the eigendecomposition of their Hessian matrix, examining how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
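The object of study, in miniature: eigendecompose the Hessian at a point and inspect the negative eigenvalues, whose eigenvectors give descent directions through saddles. The paper works at network scale, where the Hessian is accessed implicitly; this toy loss makes the matrix explicit via finite differences.

```python
import numpy as np

def loss(w):
    return w[0] * w[1] + 0.1 * (w[0] ** 4 + w[1] ** 4)  # saddle at the origin

def hessian(f, w, eps=1e-4):
    """Central finite-difference Hessian of f at w."""
    d = len(w)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei, ej = np.eye(d)[i] * eps, np.eye(d)[j] * eps
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4 * eps ** 2)
    return H

eigvals, eigvecs = np.linalg.eigh(hessian(loss, np.zeros(2)))
print(eigvals[eigvals < 0])   # one negative eigenvalue: a descent direction
```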
On the Use of Sparse Time Relative Auditory Codes for Music
TLDR
This paper investigates audio features based on realistic music examples, evaluates them as input features for supervised learning, and identifies three specific issues that will need to be addressed in order to obtain their full benefit for MIR applications.
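The codes in question represent audio as a short list of (atom, onset, amplitude) events. A hypothetical sketch of matching pursuit, the kind of greedy decomposition that produces such sparse time-relative codes; the random unit-norm dictionary stands in for gammatone-like atoms, and none of this is the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def matching_pursuit(signal, atoms, n_events=10):
    """Greedily explain `signal` as a few (atom, onset, amplitude) events;
    atoms are assumed unit-norm so the correlation is the coefficient."""
    residual = signal.copy()
    events = []
    for _ in range(n_events):
        best = None
        for k, atom in enumerate(atoms):
            corr = np.correlate(residual, atom, mode="valid")
            t = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[t]) > abs(best[0]):
                best = (corr[t], k, t)
        amp, k, t = best
        residual[t:t + len(atoms[k])] -= amp * atoms[k]  # subtract the event
        events.append((k, t, amp))                       # time-relative code
    return events, residual

atoms = [a / np.linalg.norm(a) for a in rng.normal(size=(4, 32))]
events, residual = matching_pursuit(rng.normal(size=256), atoms)
```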
Reducing the variance in online optimization by transporting past gradients
TLDR
This work proposes implicit gradient transport (IGT), which transforms gradients computed at previous iterates into gradients evaluated at the current iterate without using the Hessian explicitly; IGT yields the optimal asymptotic convergence rate for online stochastic optimization in the restricted setting where the Hessians of all component functions are equal.
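The equal-Hessian setting is where the transport is exact, and that can be checked numerically: updating a running average with a gradient taken at an extrapolated point reproduces the average of all past gradients re-evaluated at the current iterate, with no explicit Hessian. A toy verification under that assumption, not the full IGT optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, lr = 4, 50, 0.05
A = rng.normal(size=(d, d))
H = A @ A.T + np.eye(d)                      # shared Hessian across samples
bs = rng.normal(size=(T, d))                 # per-sample linear terms
grad = lambda t, x: H @ x - bs[t]            # grad of f_t(x) = x'Hx/2 - b_t'x

theta = rng.normal(size=d)
theta_prev, v = theta.copy(), np.zeros(d)
for t in range(T):
    gamma = t / (t + 1.0)
    shift = gamma / (1.0 - gamma) * (theta - theta_prev)  # extrapolation
    v = gamma * v + (1.0 - gamma) * grad(t, theta + shift)  # transported avg
    theta_prev = theta.copy()
    theta = theta - lr * v

exact = np.mean([grad(t, theta_prev) for t in range(T)], axis=0)
print(np.allclose(v, exact))                 # True: the transport is implicit
```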
Learning to Fix Build Errors with Graph2Diff Neural Networks
TLDR
This work presents Graph2Diff, a new deep learning architecture for automatically localizing and fixing build errors; it represents source code, build configuration files, and compiler diagnostic messages as a graph and uses a graph neural network to predict a diff.
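A hypothetical sketch of the graph side of such a model: nodes for code tokens and compiler diagnostics, typed edges between them, and one round of message passing to update node states. Names and sizes are illustrative, not the Graph2Diff architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 6, 8
h = rng.normal(size=(n_nodes, dim))          # initial node embeddings
edges = {                                     # edge type -> (src, dst) pairs
    "next_token": [(0, 1), (1, 2), (2, 3)],
    "diagnostic_points_at": [(5, 2)],         # compiler message -> code token
}
W = {etype: 0.1 * rng.normal(size=(dim, dim)) for etype in edges}

def message_pass(h):
    msg = np.zeros_like(h)
    for etype, pairs in edges.items():
        for src, dst in pairs:
            msg[dst] += h[src] @ W[etype]     # typed message along the edge
    return np.tanh(h + msg)                   # updated node states

h = message_pass(h)                           # one round of propagation
```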