• Publications
Distributed Representations of Words and Phrases and their Compositionality
We present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
  • Citations: 18,433
  • Influence: 2,803
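The phrase-finding step described in this paper scores adjacent word pairs with (count(a,b) − δ) / (count(a) · count(b)) and merges high-scoring pairs into phrases. A minimal sketch of that scoring rule follows; the function name, the discount/threshold values, and the toy corpus are illustrative, not from the paper.

```python
from collections import Counter

def find_phrases(tokens, delta=5, threshold=1e-4):
    """Score adjacent word pairs; pairs scoring above the threshold
    are treated as phrases (word2vec-style phrase detection).
    delta discounts rare co-occurrences so they are not over-scored."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        score = (n_ab - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:
            phrases.add((a, b))
    return phrases
```

In the paper this pass is run repeatedly with a decreasing threshold so that longer phrases (three or more words) can form from already-merged bigrams.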
Efficient Estimation of Word Representations in Vector Space
We propose two novel model architectures for computing continuous vector representations of words from very large data sets.
  • Citations: 14,780
  • Influence: 2,564
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
This paper describes the TensorFlow interface for expressing machine learning algorithms, and an implementation of that interface that we have built at Google.
  • Citations: 7,410
  • Influence: 862
Large Scale Distributed Deep Networks
We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS.
  • Citations: 2,338
  • Influence: 256
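The core idea of Downpour SGD is that many model replicas fetch the current parameters, compute gradients on their own data shard, and push updates back asynchronously, without locking. A toy single-process simulation of that pattern, assuming a one-parameter least-squares model (the function name and hyperparameters are illustrative, not the paper's):

```python
import random
import threading

def downpour_sgd(data_shards, lr=0.1, steps=200):
    """Toy Downpour-SGD-style loop: each replica (thread) repeatedly
    fetches the shared parameter, computes a gradient on a sample from
    its own shard, and pushes an update without any locking."""
    params = [0.0]  # shared "parameter server" state: fit the data mean

    def replica(shard):
        for _ in range(steps):
            w = params[0]                # fetch current parameters
            x = random.choice(shard)     # one sample from this shard
            grad = 2 * (w - x)           # gradient of (w - x)^2
            params[0] = w - lr * grad    # push update (possibly stale w)

    threads = [threading.Thread(target=replica, args=(s,))
               for s in data_shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params[0]
```

The asynchrony means some updates are computed from stale parameters or are overwritten; the paper's observation is that in practice this noise is tolerable and the throughput gain dominates.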
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems.
  • Citations: 2,746
  • Influence: 224
DeViSE: A Deep Visual-Semantic Embedding Model
We present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data and semantic information gleaned from unannotated text.
  • Citations: 1,388
  • Influence: 180
Wide & Deep Learning for Recommender Systems
In this paper, we present Wide & Deep learning---jointly trained wide linear models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems.
  • Citations: 930
  • Influence: 136
Zero-Shot Learning by Convex Combination of Semantic Embeddings
We propose a simple method for constructing an image embedding system from any existing n-way image classifier and a semantic word embedding model, which contains the n class labels in its vocabulary.
  • Citations: 558
  • Influence: 110
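The construction this paper (ConSE) describes maps an image into the semantic space by taking a convex combination of the embeddings of the classifier's top-T predicted labels, weighted by their normalized probabilities. A minimal sketch under that reading; the function name, dictionary-based interfaces, and the toy probabilities below are illustrative:

```python
def conse_embedding(class_probs, embeddings, top_t=10):
    """Convex combination of semantic embeddings (ConSE-style).
    class_probs: label -> probability from an existing n-way classifier.
    embeddings:  label -> semantic word vector for that class label.
    Returns the probability-weighted average of the top-T label vectors."""
    top = sorted(class_probs.items(), key=lambda kv: kv[1], reverse=True)[:top_t]
    z = sum(p for _, p in top)  # renormalize over the top-T labels
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for label, p in top:
        for i, v in enumerate(embeddings[label]):
            vec[i] += (p / z) * v
    return vec
```

Zero-shot classification then reduces to a nearest-neighbor lookup of this vector among the embeddings of unseen class labels.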
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
We propose a simple solution for translating between multiple languages with a single Neural Machine Translation (NMT) model, taking advantage of multilingual data to improve NMT.
  • Citations: 808
  • Influence: 98
Building high-level features using large scale unsupervised learning
We consider the problem of building high-level, class-specific feature detectors from only unlabeled data.
  • Citations: 1,860
  • Influence: 90