Publications
Distributed Representations of Words and Phrases and their Compositionality
TLDR
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
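The phrase-finding step works by scoring adjacent word pairs from corpus counts and joining high-scoring pairs into single tokens. Below is a minimal sketch of that count-based scoring; the corpus-size scaling factor, the discount delta, and the threshold are illustrative choices, not the paper's exact formulation or tuned settings.

    from collections import Counter

    def find_phrases(tokens, delta=1, threshold=1.0):
        """Score adjacent word pairs; pairs scoring above the threshold
        would be joined into a single phrase token (e.g. 'new_york')."""
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        phrases = set()
        for (w1, w2), n in bigrams.items():
            # Discounting by delta keeps very infrequent pairs from forming phrases.
            score = (n - delta) * len(tokens) / (unigrams[w1] * unigrams[w2])
            if score > threshold:
                phrases.add((w1, w2))
        return phrases

    # e.g. find_phrases("new york is in new york state".split()) -> {('new', 'york')}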
Efficient Estimation of Word Representations in Vector Space
TLDR
Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed, and these vectors are shown to provide state-of-the-art performance on a test set for measuring syntactic and semantic word similarities.
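The similarity evaluation rests on the property that simple vector arithmetic on the learned embeddings captures analogies. The toy sketch below shows the arithmetic with hand-made 3-d vectors; real embeddings would come from a trained model, and the values here are purely illustrative.

    import numpy as np

    # Toy 3-d "embeddings" (illustrative values, not trained vectors).
    emb = {
        "king":  np.array([0.8, 0.6, 0.1]),
        "queen": np.array([0.8, 0.1, 0.6]),
        "man":   np.array([0.2, 0.7, 0.1]),
        "woman": np.array([0.2, 0.2, 0.6]),
    }

    def nearest(vec, exclude):
        """Return the word whose embedding is most cosine-similar to vec."""
        cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        return max((w for w in emb if w not in exclude), key=lambda w: cos(emb[w], vec))

    # Analogy by vector arithmetic: king - man + woman ~ queen
    print(nearest(emb["king"] - emb["man"] + emb["woman"], exclude={"king", "man", "woman"}))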
MapReduce: Simplified Data Processing on Large Clusters
TLDR
This paper presents MapReduce, a programming model and associated implementation for processing and generating large data sets that runs on large clusters of commodity machines and is highly scalable.
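In the model, users write only a map function and a reduce function; the runtime handles partitioning, scheduling, and fault tolerance across the cluster. The single-process sketch below imitates that contract for word counting; run_local is a stand-in for the distributed runtime, not part of the actual system.

    from itertools import groupby

    def map_fn(doc):                      # emit (word, 1) for every word
        for word in doc.split():
            yield word, 1

    def reduce_fn(word, counts):          # sum the counts for one word
        return word, sum(counts)

    def run_local(docs):
        """Single-process stand-in for the distributed runtime."""
        pairs = sorted(kv for doc in docs for kv in map_fn(doc))   # map + shuffle/sort
        return [reduce_fn(k, (v for _, v in group))                # reduce per key
                for k, group in groupby(pairs, key=lambda kv: kv[0])]

    print(run_local(["the quick fox", "the lazy dog"]))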
TensorFlow: A system for large-scale machine learning
TLDR
The TensorFlow dataflow model is described, and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.
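A tiny runnable illustration of the dataflow idea using today's TensorFlow 2 API, where tf.function traces a Python training step into a graph of operations; this is not the graph-construction interface described in the paper, just a sketch of the same concept.

    import tensorflow as tf

    w = tf.Variable(3.0)

    @tf.function                      # traces the Python function into a dataflow graph
    def train_step(x, y, lr=0.1):
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean((w * x - y) ** 2)   # simple squared-error loss
        grad = tape.gradient(loss, w)
        w.assign_sub(lr * grad)       # one gradient-descent update, run as graph ops
        return loss

    print(float(train_step(tf.constant(2.0), tf.constant(4.0))))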
Distilling the Knowledge in a Neural Network
TLDR
This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model, and introduces a new type of ensemble composed of one or more full models and many specialist models that learn to distinguish fine-grained classes the full models confuse.
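The core recipe is to train the small model on the ensemble's softened output distribution (a softmax at a raised temperature) alongside the usual hard labels. A NumPy sketch of that combined loss is below; the temperature and mixing weight are illustrative, not values from the paper.

    import numpy as np

    def softmax(logits, T=1.0):
        z = np.exp((logits - logits.max()) / T)
        return z / z.sum()

    def distillation_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.5):
        """Cross-entropy against softened teacher targets plus ordinary
        cross-entropy against the hard label. T and alpha are illustrative."""
        soft_targets = softmax(teacher_logits, T)
        student_soft = softmax(student_logits, T)
        soft_loss = -np.sum(soft_targets * np.log(student_soft))
        hard_loss = -np.log(softmax(student_logits)[true_label])
        # T**2 rescaling keeps gradient magnitudes comparable across temperatures.
        return alpha * (T ** 2) * soft_loss + (1 - alpha) * hard_loss

    print(distillation_loss(np.array([2.0, 0.5, 0.1]),
                            np.array([3.0, 1.0, 0.2]), true_label=0))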
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TLDR
The TensorFlow interface and an implementation of that interface built at Google are described; the system has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
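One element of that interface is explicit device placement, which lets the same operations be pinned to different devices or workers in a heterogeneous cluster. The minimal sketch below uses the current TensorFlow 2 API and assumes only a local /CPU:0 device is available.

    import tensorflow as tf

    # Explicit device placement: the same ops could be pinned to a GPU, TPU,
    # or remote worker; here only the local CPU is assumed.
    with tf.device("/CPU:0"):
        a = tf.random.uniform((1000, 1000))
        b = tf.random.uniform((1000, 1000))
        c = tf.matmul(a, b)

    print(c.device)   # shows where the op actually ran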
Bigtable: A Distributed Storage System for Structured Data
TLDR
The simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, is described, along with the design and implementation of Bigtable.
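That data model is essentially a sparse, sorted, multidimensional map from (row key, column, timestamp) to an uninterpreted byte string. The toy in-memory class below sketches just the map semantics; it has none of Bigtable's distribution, persistence, or compression, and the example keys are made up.

    from collections import defaultdict
    import time

    class ToyBigtable:
        """Toy sketch of Bigtable's data model: a sparse map from
        (row key, column, timestamp) to an uninterpreted byte string."""

        def __init__(self):
            self._cells = defaultdict(dict)   # (row, column) -> {timestamp: value}

        def put(self, row, column, value, ts=None):
            self._cells[(row, column)][ts or time.time_ns()] = value

        def get(self, row, column):
            versions = self._cells.get((row, column), {})
            return versions[max(versions)] if versions else None   # latest version

    t = ToyBigtable()
    t.put("com.example/index", "contents:", b"<html>...</html>")
    print(t.get("com.example/index", "contents:"))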
In-datacenter performance analysis of a tensor processing unit
TLDR
This paper evaluates a custom ASIC, the Tensor Processing Unit (TPU), deployed in datacenters since 2015 to accelerate the inference phase of neural networks (NN), and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, contemporaries deployed in the same datacenters.
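Much of the analysis is framed with a roofline model: attainable throughput is the minimum of peak compute and memory bandwidth times operational intensity. The sketch below reproduces that arithmetic; the peak and bandwidth constants are approximate TPU v1 figures recalled from the paper and should be treated as illustrative rather than authoritative.

    # Roofline-style estimate: attainable throughput is limited by either peak
    # compute or (operational intensity x memory bandwidth).
    PEAK_OPS = 92e12          # ~92 tera-ops/s (8-bit), approximate figure
    MEM_BW   = 34e9           # ~34 GB/s off-chip memory bandwidth, approximate figure

    def attainable(ops_per_byte):
        return min(PEAK_OPS, ops_per_byte * MEM_BW)

    for intensity in (10, 100, 1000, 10000):
        print(f"{intensity:>6} ops/byte -> {attainable(intensity) / 1e12:.1f} TOPS")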
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
TLDR
GNMT, Google's Neural Machine Translation system, is presented; it attempts to address many of the weaknesses of conventional phrase-based translation systems and provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models.
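That balance comes from segmenting words into subword units ("wordpieces"). The toy function below shows greedy longest-match segmentation with a made-up vocabulary and a BERT-style "##" continuation marker for clarity; it is an illustration of the idea, not Google's implementation.

    def wordpiece_segment(word, vocab):
        """Greedy longest-match-first segmentation into subword units."""
        pieces, start = [], 0
        while start < len(word):
            for end in range(len(word), start, -1):           # longest match first
                piece = word[start:end] if start == 0 else "##" + word[start:end]
                if piece in vocab:
                    pieces.append(piece)
                    start = end
                    break
            else:
                return ["[UNK]"]                              # no piece matched
        return pieces

    vocab = {"trans", "##lat", "##ion", "un", "##break", "##able"}
    print(wordpiece_segment("translation", vocab))   # ['trans', '##lat', '##ion']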
Large Scale Distributed Deep Networks
TLDR
This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores, and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
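Downpour SGD follows the parameter-server pattern: many workers fetch a possibly stale copy of the parameters, compute gradients on their data shard, and push updates back asynchronously. The sketch below simulates that pattern sequentially in one process; it is a conceptual illustration, not the distributed implementation.

    import numpy as np

    class ParameterServer:
        """Holds the global parameters; workers fetch and push updates
        asynchronously (Downpour-style), so updates may be stale."""
        def __init__(self, dim):
            self.w = np.zeros(dim)

        def fetch(self):
            return self.w.copy()

        def push(self, grad, lr=0.1):
            self.w -= lr * grad          # apply whichever gradient arrives next

    def worker_step(ps, x, y):
        w = ps.fetch()                   # possibly stale snapshot of the parameters
        grad = 2 * (w @ x - y) * x       # gradient of squared error for one example
        ps.push(grad)

    # Toy run: several "workers" interleave their updates on shared parameters.
    ps = ParameterServer(dim=2)
    data = [(np.array([1.0, 0.0]), 1.0), (np.array([0.0, 1.0]), 2.0)]
    for _ in range(50):
        for x, y in data:
            worker_step(ps, x, y)
    print(ps.w)     # approaches [1, 2]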