Publications
Show and tell: A neural image caption generator
Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a…
  • Citations: 3,328
  • Influence: 463
Adversarial examples in the physical world
Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is…
  • Citations: 1,822
  • Influence: 350
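This line of work builds on the fast gradient sign method (FGSM), which nudges an input by a small step in the direction of the sign of the loss gradient with respect to that input. Below is a minimal, self-contained sketch of that one-step attack on a toy logistic-regression classifier; the model, function names, and numbers are illustrative assumptions, not code from the paper.

```python
# Illustrative FGSM sketch on a toy logistic-regression classifier (all names assumed).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Move x by eps in the direction of the sign of the cross-entropy gradient w.r.t. x."""
    p = sigmoid(x @ w + b)             # predicted probability of class 1
    grad_x = (p - y) * w               # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)   # one-step adversarial example

rng = np.random.default_rng(0)
w, b = rng.normal(size=4), 0.1         # toy model parameters
x, y = rng.normal(size=4), 1.0         # one input with true label 1
x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
print("clean prob:", sigmoid(x @ w + b), "adversarial prob:", sigmoid(x_adv @ w + b))
```

The perturbation is bounded by eps in the max norm, which is why such examples can look nearly identical to the clean input while still lowering the probability of the correct class.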
Adversarial Machine Learning at Scale
Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of…
  • Citations: 1,034
  • Influence: 246
Density estimation using Real NVP
Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is…
  • Citations: 875
  • Influence: 221
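Real NVP obtains tractable sampling, inference, and exact log-likelihood evaluation through invertible affine coupling layers whose Jacobian log-determinant reduces to a sum of predicted log-scales. The sketch below illustrates one such coupling layer in NumPy; the tiny placeholder network and all names are assumptions, not the authors' implementation.

```python
# Illustrative sketch of a single Real NVP-style affine coupling layer (names assumed).
# Half of the input passes through unchanged and parameterizes a scale and shift for
# the other half, so the inverse and the Jacobian log-determinant are both cheap.
import numpy as np

rng = np.random.default_rng(0)
D = 4                                       # total dimensionality (assumed even)
W1 = rng.normal(size=(D // 2, 8)) * 0.1     # placeholder network weights
W2 = rng.normal(size=(8, D)) * 0.1

def scale_and_shift(x1):
    """Placeholder network producing log-scale s and shift t from the fixed half."""
    h = np.tanh(x1 @ W1)
    out = h @ W2
    return out[: D // 2], out[D // 2 :]     # s (log-scale), t (shift)

def coupling_forward(x):
    x1, x2 = x[: D // 2], x[D // 2 :]
    s, t = scale_and_shift(x1)
    y2 = x2 * np.exp(s) + t                 # affine transform of the second half
    log_det = np.sum(s)                     # log|det J| is just the sum of log-scales
    return np.concatenate([x1, y2]), log_det

def coupling_inverse(y):
    y1, y2 = y[: D // 2], y[D // 2 :]
    s, t = scale_and_shift(y1)              # y1 == x1, so s and t are recoverable
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2])

x = rng.normal(size=D)
y, log_det = coupling_forward(x)
print(np.allclose(coupling_inverse(y), x), log_det)   # exact invertibility
```

Stacking such layers while alternating which half is held fixed gives an expressive yet exactly invertible density model.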
Generating Sentences from a Continuous Space
The standard recurrent neural network language model (RNNLM) generates sentences one word at a time and does not work from an explicit global sentence representation. In this work, we introduce and…
  • Citations: 1,089
  • Influence: 215
Understanding deep learning requires rethinking generalization
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small…
  • Citations: 1,984
  • Influence: 202
DeViSE: A Deep Visual-Semantic Embedding Model
Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring…
  • Citations: 1,384
  • Influence: 179
Tacotron: Towards End-to-End Speech Synthesis
A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module. Building these components often requires…
  • Citations: 501
  • Influence: 115
Zero-Shot Learning by Convex Combination of Semantic Embeddings
Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image…
  • Citations: 555
  • Influence: 111
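The convex-combination idea (ConSE) maps an image into the semantic embedding space as a probability-weighted average of the embeddings of the classifier's top-k predicted training classes, then matches the result against unseen-class embeddings. The sketch below is a minimal illustration under assumed shapes and names; it is not the paper's code.

```python
# Illustrative ConSE-style sketch (shapes, data, and names are assumptions).
import numpy as np

def conse_embed(class_probs, train_class_embs, k=5):
    """Convex combination of the top-k training-class embeddings."""
    top = np.argsort(class_probs)[-k:]
    weights = class_probs[top] / class_probs[top].sum()   # renormalize over top-k
    return weights @ train_class_embs[top]

def zero_shot_predict(image_emb, unseen_class_embs):
    """Nearest unseen class by cosine similarity in the semantic space."""
    sims = unseen_class_embs @ image_emb
    sims /= np.linalg.norm(unseen_class_embs, axis=1) * np.linalg.norm(image_emb)
    return int(np.argmax(sims))

rng = np.random.default_rng(0)
train_class_embs = rng.normal(size=(10, 16))    # e.g. word vectors for seen classes
unseen_class_embs = rng.normal(size=(3, 16))    # word vectors for unseen classes
probs = rng.dirichlet(np.ones(10))              # softmax output of a seen-class classifier
image_emb = conse_embed(probs, train_class_embs)
print("predicted unseen class:", zero_shot_predict(image_emb, unseen_class_embs))
```

No zero-shot-specific training is required: only a standard classifier over seen classes and a pre-trained semantic embedding for every class label.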
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to…
  • Citations: 894
  • Influence: 100
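Scheduled sampling replaces pure teacher forcing with a curriculum: at each decoding step during training, the previous ground-truth token is fed with some probability and the model's own previous prediction otherwise, and that probability is decayed over training. The toy sketch below illustrates the per-step choice and an inverse-sigmoid-style decay; the one-step decoder and all names are hypothetical, not the paper's code.

```python
# Illustrative scheduled-sampling sketch (decoder and names are hypothetical).
import math
import random

def inverse_sigmoid_decay(step, k=100.0):
    """Probability of feeding the ground-truth token, decaying as training progresses."""
    return k / (k + math.exp(step / k))

def decode_with_scheduled_sampling(model_step, targets, start_token, p_truth):
    """Run a decoder over `targets`, mixing ground truth and model predictions."""
    prev = start_token
    predictions = []
    for t, gold in enumerate(targets):
        pred = model_step(prev, t)                 # hypothetical one-step decoder
        predictions.append(pred)
        # Choose the next input: ground truth with probability p_truth, else the prediction.
        prev = gold if random.random() < p_truth else pred
    return predictions

# Toy usage: a "decoder" that just increments its previous input.
targets = [3, 1, 4, 1, 5]
preds = decode_with_scheduled_sampling(
    model_step=lambda prev, t: (prev + 1) % 10,
    targets=targets,
    start_token=0,
    p_truth=inverse_sigmoid_decay(step=500),
)
print(preds)
```

Early in training the decoder mostly sees ground truth; later it mostly sees its own outputs, which narrows the gap between training and inference conditions.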