Publications (sorted by influence)
DeepFace: Closing the Gap to Human-Level Performance in Face Verification
TLDR
We present a system (DeepFace) that has closed the majority of the remaining gap in the most popular benchmark in unconstrained face recognition, and is now at the brink of human-level accuracy.
  • Citations: 4,211 · Highly influential citations: 303
Large Scale Distributed Deep Networks
TLDR
We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS.
  • Citations: 2,503 · Highly influential citations: 264
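The core idea behind Downpour SGD, many model replicas asynchronously fetching parameters from and pushing gradient updates to a shared parameter server, can be simulated serially. A minimal pure-Python sketch on a toy 1-D least-squares problem (the function name, sharding scheme, and all values here are illustrative, not from the paper):

```python
import random

def downpour_sgd_sketch(shards, steps=2000, lr=0.05):
    """Serially simulated parameter-server SGD: each 'replica' owns one
    data shard, fetches the shared parameter, computes a gradient on a
    local minibatch, and pushes its update back to the server."""
    w = 0.0                                      # parameter held by the "server"
    for step in range(steps):
        shard = shards[step % len(shards)]       # round-robin over replicas
        x, y = random.choice(shard)              # replica's local sample
        grad = 2.0 * (w * x - y) * x             # d/dw of (w*x - y)^2
        w -= lr * grad                           # push update to the server
    return w

random.seed(0)
true_w = 3.0
data = [(x / 10.0, true_w * (x / 10.0)) for x in range(1, 41)]
shards = [data[i::4] for i in range(4)]          # four model replicas
w = downpour_sgd_sketch(shards)
```

In the real system the replicas run concurrently and apply updates against stale parameters; the serial loop above keeps only the data-parallel, fetch-and-push structure.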
Word Translation Without Parallel Data
TLDR
We show that we can build a bilingual dictionary between two languages without using parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way.
  • Citations: 781 · Highly influential citations: 236
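A Procrustes-style step of the kind used to refine such alignments can be sketched in two dimensions, where the best rotation mapping one set of vectors onto another has a closed form. The paper's method is unsupervised and high-dimensional; this toy uses a small matched dictionary and invented vectors:

```python
import math

def fit_rotation(pairs):
    """Closed-form 2-D orthogonal Procrustes: the rotation that best maps
    each source vector onto its paired target vector (least squares)."""
    A = sum(x1 * y1 + x2 * y2 for (x1, x2), (y1, y2) in pairs)  # sum of dot products
    B = sum(x1 * y2 - x2 * y1 for (x1, x2), (y1, y2) in pairs)  # sum of cross products
    return math.atan2(B, A)                                     # optimal angle

def rotate(v, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

# Toy "embedding spaces": the target space is the source space rotated by 0.7 rad.
src = [(1.0, 0.0), (0.3, 0.8), (-0.5, 0.4), (0.2, -0.9)]
tgt = [rotate(v, 0.7) for v in src]
theta = fit_rotation(list(zip(src, tgt)))
```

In high dimensions the analogous solution comes from an SVD rather than a single angle, but the objective, the orthogonal map minimizing the distance between matched pairs, is the same.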
DeViSE: A Deep Visual-Semantic Embedding Model
TLDR
We present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data and semantic information gleaned from unannotated text.
  • Citations: 1,506 · Highly influential citations: 185
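Once images and labels live in a shared embedding space, recognition reduces to a nearest-neighbor lookup among label vectors. A hedged sketch with invented 3-D vectors standing in for word embeddings (the real model maps images into this space with a trained network):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def predict_label(image_vec, label_vecs):
    """Nearest-label lookup in the shared visual-semantic space: compare
    the image embedding to every label's word vector and keep the best."""
    return max(label_vecs, key=lambda name: cosine(image_vec, label_vecs[name]))

# Toy word vectors for three labels (illustrative values, not real embeddings).
labels = {"cat": (0.9, 0.1, 0.0), "car": (0.0, 0.9, 0.4), "tree": (0.1, 0.0, 0.9)}
pred = predict_label((0.8, 0.2, 0.1), labels)
```

Because candidate labels are just points in the word-vector space, the same lookup can rank labels the image model was never trained on, which is what enables zero-shot prediction.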
Sequence Level Training with Recurrent Neural Networks
TLDR
We propose a novel sequence level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE.
  • Citations: 923 · Highly influential citations: 144
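Two ingredients such training needs can be sketched: a sequence-level reward computed on whole outputs (here a toy unigram recall standing in for BLEU/ROUGE) and REINFORCE-style weights that score sampled sequences against a baseline. This is an illustrative sketch, not the paper's exact estimator:

```python
def unigram_recall(candidate, reference):
    """Toy sequence-level metric (a stand-in for BLEU/ROUGE): the fraction
    of reference tokens that also appear in the candidate."""
    cand = set(candidate.split())
    ref = reference.split()
    return sum(tok in cand for tok in ref) / len(ref)

def reinforce_weights(rewards):
    """REINFORCE-style credit assignment: weight each sampled sequence by
    its reward minus a baseline (here the batch mean), so above-average
    samples are reinforced and below-average ones suppressed."""
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]

ref = "the cat sat on the mat"
samples = ["the cat sat", "a dog ran", "the cat sat on the mat"]
rewards = [unigram_recall(s, ref) for s in samples]
weights = reinforce_weights(rewards)
```

In training, each sampled sequence's log-probability gradient would be scaled by its weight, which is how a non-differentiable test-time metric can steer the model.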
What is the best multi-stage architecture for object recognition?
In many recent object recognition systems, feature extraction stages are generally composed of a filter bank, a non-linear transformation, and some sort of feature pooling layer. Most systems use …
  • Citations: 1,828 · Highly influential citations: 122
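The three-part stage named in the excerpt (filter bank, non-linearity, pooling) can be sketched on a 1-D signal. The specific kernels, the absolute-value rectification, and the pooling width below are illustrative choices, not the paper's:

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D correlation of a signal with a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def feature_stage(signal, filter_bank):
    """One feature-extraction stage: a filter bank, a pointwise
    non-linearity (here rectification via abs), and max pooling over
    non-overlapping windows of width 2."""
    maps = []
    for kernel in filter_bank:
        fmap = conv1d(signal, kernel)                        # filter bank
        fmap = [abs(v) for v in fmap]                        # non-linearity
        pooled = [max(fmap[i:i + 2])                         # feature pooling
                  for i in range(0, len(fmap) - 1, 2)]
        maps.append(pooled)
    return maps

signal = [0.0, 1.0, 1.0, 0.0, -1.0, -1.0, 0.0, 1.0]
bank = [[1.0, -1.0],   # edge detector
        [0.5, 0.5]]    # smoother
features = feature_stage(signal, bank)
```

Stacking two or three such stages, with learned rather than hand-picked filters, gives the multi-stage architectures the paper compares.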
Gradient Episodic Memory for Continual Learning
TLDR
We propose a set of metrics to evaluate models learning over a continuum of data.
  • Citations: 518 · Highly influential citations: 101
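The mechanism the title refers to, projecting the current gradient so it cannot increase the loss on episodic memories of past tasks, can be sketched for a single previous task, where the projection has a closed form. The paper solves a small quadratic program to handle many past tasks at once; names here are illustrative:

```python
def gem_project(grad, ref_grad):
    """GEM-style constraint for one previous task: if the proposed
    gradient has negative inner product with the old task's gradient
    (i.e. it would raise the old task's loss), project it onto the
    feasible half-space; otherwise leave it unchanged."""
    dot = sum(g * r for g, r in zip(grad, ref_grad))
    if dot >= 0:
        return list(grad)                        # no interference; keep as-is
    scale = dot / sum(r * r for r in ref_grad)   # projection coefficient
    return [g - scale * r for g, r in zip(grad, ref_grad)]

g = [1.0, -2.0]        # gradient on the current task
g_ref = [1.0, 1.0]     # gradient on the remembered (old) task
g_proj = gem_project(g, g_ref)
```

The projected gradient is orthogonal to the old task's gradient, so a small step along it leaves the old task's loss unchanged to first order while still making progress on the new task.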
Building high-level features using large scale unsupervised learning
TLDR
We consider the problem of building high-level, class-specific feature detectors from only unlabeled data.
  • Citations: 1,935 · Highly influential citations: 99
Unsupervised Machine Translation Using Monolingual Corpora Only
TLDR
We propose a model that takes sentences from monolingual corpora in two languages and maps them into the same latent space; the model effectively learns to translate without using any labeled data.
  • Citations: 565 · Highly influential citations: 95
Phrase-Based & Neural Unsupervised Machine Translation
TLDR
We propose two model variants, a neural and a phrase-based model.
  • Citations: 370 · Highly influential citations: 93