Publications
Attention is All you Need
TLDR
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
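The core operation of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. Below is a minimal single-head NumPy sketch; the paper's model uses multiple heads with learned projections, which are omitted here.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # (n_queries, n_keys)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ V                                     # (n_queries, d_v)

# Toy usage: 4 query positions, 6 key/value positions, d_k = d_v = 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)         # (4, 8)
```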
Natural Questions: A Benchmark for Question Answering Research
TLDR
The Natural Questions corpus, a question answering dataset, is presented, introducing robust metrics for evaluating question answering systems, demonstrating high human upper bounds on these metrics, and establishing baseline results using competitive methods drawn from related literature.
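Question answering systems are commonly scored with span-overlap metrics; the sketch below is a generic token-level F1 for illustration only, not the official Natural Questions long-/short-answer evaluation.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer string.

    Illustrative only; the official Natural Questions evaluation defines its
    own long-/short-answer metrics over annotated spans.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the 44th president", "44th president of the united states"))
```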
WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia
TLDR
This work presents WIKIREADING, a large-scale natural language understanding task and publicly available dataset with 18 million instances, and compares various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering.
Coarse-to-Fine Question Answering for Long Documents
TLDR
A framework for question answering is presented that can efficiently scale to longer documents while maintaining or even improving the performance of state-of-the-art models; sentence selection is treated as a latent variable trained jointly from the answer alone using reinforcement learning.
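A hedged sketch of the latent-sentence-selection idea: sample one sentence as the latent choice and update its selector with REINFORCE using only an answer-correctness reward. The linear scorer, dimensions, and reward function below are toy stand-ins for the paper's models.

```python
import torch
import torch.nn as nn

# Toy coarse-to-fine setup: a linear scorer picks one sentence (the latent
# variable); the reward function stands in for "did the fine answer model,
# run on that sentence, produce the correct answer". All names are illustrative.
scorer = nn.Linear(16, 1)
optimizer = torch.optim.SGD(scorer.parameters(), lr=0.1)

def reinforce_step(sentence_reprs, reward_fn):
    logits = scorer(sentence_reprs).squeeze(-1)      # (n_sentences,)
    dist = torch.distributions.Categorical(logits=logits)
    idx = dist.sample()                              # sample the latent sentence
    reward = reward_fn(idx)                          # answer-only supervision
    loss = -dist.log_prob(idx) * reward              # REINFORCE surrogate loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return idx.item(), reward

sentences = torch.randn(5, 16)                       # 5 candidate sentence vectors
gold = 2                                             # sentence containing the answer
for _ in range(50):
    reinforce_step(sentences, lambda i: 1.0 if i.item() == gold else 0.0)
```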
Improving Neural Program Synthesis with Inferred Execution Traces
TLDR
This work splits program synthesis into two parts: inferring the execution trace from the input/output example, then inferring the program from the trace, which leads to state-of-the-art results in the Karel domain.
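A toy stand-in for the two-stage decomposition: the paper infers Karel traces and programs with neural networks, whereas the rule-based integer domain below only illustrates the example -> trace -> program interface.

```python
# Stage 1: recover the step-by-step actions (the trace) from one I/O pair.
def infer_trace(inp: int, out: int) -> list:
    step = "inc" if out >= inp else "dec"
    return [step] * abs(out - inp)

# Stage 2: recover a compact program that reproduces the trace.
def infer_program(trace: list) -> str:
    if not trace:
        return "pass"
    return f"repeat {len(trace)}: {trace[0]}"

trace = infer_trace(3, 7)        # ['inc', 'inc', 'inc', 'inc']
print(infer_program(trace))      # repeat 4: inc
```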
Neural Program Search: Solving Programming Tasks from Description and Examples
TLDR
Neural Program Search, an algorithm that generates programs from a natural language description and a small number of input/output examples, is presented; it significantly outperforms a sequence-to-sequence-with-attention baseline.
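As a rough illustration of synthesis from input/output examples, the sketch below enumerates compositions of a tiny hypothetical DSL until one satisfies all examples; the actual system instead guides a search over its DSL with a neural model conditioned on the natural-language description.

```python
from itertools import product

# Tiny made-up DSL of unary integer operations.
OPS = {
    "add1":   lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def search(examples, max_len=3):
    """Return the shortest op sequence consistent with all I/O examples."""
    for length in range(1, max_len + 1):
        for prog in product(OPS, repeat=length):
            def run(x, prog=prog):
                for op in prog:
                    x = OPS[op](x)
                return x
            if all(run(i) == o for i, o in examples):
                return prog
    return None

print(search([(1, 4), (3, 8)]))   # ('add1', 'double') satisfies both examples
```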
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
TLDR
To make out-of-the-box models flexible and usable across a wide range of problems, these canned Estimators are parameterized not only over traditional hyperparameters, but also with feature columns, a declarative specification describing how to interpret input data.
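A short sketch using the TensorFlow 1.x-era Estimator and feature-column APIs described in the paper; the column names, vocabulary, and toy data are made up for illustration.

```python
import tensorflow as tf  # legacy tf.estimator / tf.feature_column APIs

# Hypothetical feature columns declaring how raw input fields are interpreted.
age = tf.feature_column.numeric_column("age")
city = tf.feature_column.indicator_column(
    tf.feature_column.categorical_column_with_vocabulary_list(
        "city", ["nyc", "sfo", "zrh"]))

# A canned Estimator is parameterized by hyperparameters and feature columns.
estimator = tf.estimator.DNNClassifier(
    feature_columns=[age, city],
    hidden_units=[32, 16],
    n_classes=2)

def input_fn():
    # Toy in-memory data; real pipelines would read files via tf.data.
    features = {"age": [25.0, 41.0, 33.0], "city": ["nyc", "sfo", "zrh"]}
    labels = [0, 1, 0]
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

estimator.train(input_fn, steps=10)
```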
NAPS: Natural Program Synthesis Dataset
TLDR
A program synthesis-oriented dataset is presented, consisting of human-written problem statements collected via crowdsourcing and solutions extracted from human-written programs in programming competitions, accompanied by input/output examples.
Neural Program Search: Solving Data Processing Tasks from Description and Examples
TLDR
Neural Program Search, an algorithm that generates programs from a natural language description and a small number of input/output examples, is presented; it significantly outperforms a sequence-to-sequence model with attention baseline.