Gradient-based learning applied to document recognition
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
Natural Language Processing (Almost) from Scratch
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
Wasserstein Generative Adversarial Networks
This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.
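The WGAN critic update can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a toy linear critic f(x) = w·x stands in for the neural-network critic, and the data, learning rate, and step count are illustrative assumptions (the 0.01 clipping constant matches the paper's default).

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # critic parameters (toy linear critic f(x) = w.x)
clip = 0.01              # weight-clipping constant, as in the paper
lr = 0.05                # illustrative learning rate

real = rng.normal(loc=1.0, size=(64, 2))   # stand-in for data samples
fake = rng.normal(loc=-1.0, size=(64, 2))  # stand-in for generator samples

for _ in range(100):
    # Gradient of E[f(real)] - E[f(fake)] w.r.t. w for the linear critic;
    # ascend to widen the gap (i.e., maximize the critic objective).
    grad = real.mean(axis=0) - fake.mean(axis=0)
    w += lr * grad
    # Clip weights to a compact box: the paper's crude way of keeping
    # the critic (approximately) Lipschitz.
    w = np.clip(w, -clip, clip)

# The critic's gap estimates the Wasserstein distance (up to a constant);
# unlike a saturating GAN loss, it remains a meaningful learning curve.
wasserstein_estimate = (real @ w).mean() - (fake @ w).mean()
```

The clipping step is what distinguishes this loop from ordinary GAN discriminator training; without it the linear critic's objective would grow without bound.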
Large-Scale Machine Learning with Stochastic Gradient Descent
A more precise analysis uncovers qualitatively different tradeoffs for the case of small-scale and large-scale learning problems.
Optimization Methods for Large-Scale Machine Learning
A major theme of this study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter; this leads to a discussion of the next generation of optimization methods for large-scale machine learning.
Efficient BackProp
Towards Principled Methods for Training Generative Adversarial Networks
This paper makes theoretical steps towards fully understanding the training dynamics of generative adversarial networks, and performs targeted experiments to substantiate the theoretical analysis, verify assumptions, illustrate claims, and quantify the phenomena.
Signature Verification Using A "Siamese" Time Delay Neural Network
An algorithm for verification of signatures written on a pen-input tablet, based on a novel artificial neural network called a "Siamese" neural network, which consists of two identical sub-networks joined at their outputs.
Learning methods for generic object recognition with invariance to pose and lighting
A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second; SVMs proved impractical, while convolutional nets yielded 16.7% error.
Stochastic Gradient Descent Tricks
  • L. Bottou
  • Computer Science
  • Neural Networks: Tricks of the Trade
  • 2012
This chapter provides background material, explains why SGD is a good learning algorithm when the training set is large, and offers useful recommendations.
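The reason SGD suits large training sets — each update touches a single example, so the per-step cost does not grow with the dataset — can be sketched on a toy least-squares problem. The model, learning rate, and synthetic data below are illustrative assumptions, not the chapter's recommendations.

```python
import numpy as np

rng = np.random.default_rng(1)
true_w = np.array([2.0, -3.0])
X = rng.normal(size=(1000, 2))              # synthetic inputs
y = X @ true_w + 0.01 * rng.normal(size=1000)  # noisy linear targets

w = np.zeros(2)
lr = 0.05
for t in rng.permutation(len(X)):
    # Gradient of the loss 0.5 * (x_t . w - y_t)^2 on ONE example:
    # this is the whole cost of a step, independent of len(X).
    err = X[t] @ w - y[t]
    w -= lr * err * X[t]
# After one shuffled pass, w is close to true_w.
```

A full-batch gradient step would need all 1000 examples before moving at all; here the model has already taken 1000 updates by then, which is the tradeoff the chapter analyzes.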