Attention Is All You Need
- Ashish Vaswani, Noam M. Shazeer, Illia Polosukhin
- Computer Science · NIPS
- 12 June 2017
A new, simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
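As a concrete illustration of the mechanism this paper builds on, here is a minimal NumPy sketch of scaled dot-product attention, softmax(QKᵀ/√d_k)·V; the shapes and values are illustrative only, not the paper's full multi-head architecture.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # weighted sum of values

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)     # -> (3, 8)
```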
Natural Questions: A Benchmark for Question Answering Research
- T. Kwiatkowski, Jennimaria Palomaki, Slav Petrov
- Computer Science · Transactions of the Association for Computational Linguistics (TACL)
- 1 August 2019
The Natural Questions corpus, a question answering data set, is presented, introducing robust metrics for the purposes of evaluating question answering systems; demonstrating high human upper bounds on these metrics; and establishing baseline results using competitive methods drawn from related literature.
WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia
- D. Hewlett, Alexandre Lacoste, David Berthelot
- Computer Science · Annual Meeting of the Association for Computational Linguistics (ACL)
- 1 August 2016
This work presents WIKIREADING, a large-scale natural language understanding task and publicly available dataset with 18 million instances, and compares various state-of-the-art DNN-based architectures for document classification, information extraction, and question answering.
Coarse-to-Fine Question Answering for Long Documents
- Eunsol Choi, D. Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant
- Computer Science · Annual Meeting of the Association for Computational Linguistics (ACL)
- 6 November 2016
A framework for question answering is presented that scales efficiently to longer documents while maintaining or even improving the performance of state-of-the-art models; sentence selection is treated as a latent variable and trained jointly from the answer alone using reinforcement learning.
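The latent sentence-selection idea can be sketched as follows: treat the choice of sentence as a categorical latent variable and update the selector with REINFORCE using only an answer-level reward. This is a hypothetical, heavily simplified PyTorch sketch; the bilinear scorer, random encodings, and 0/1 reward are stand-ins, not the paper's actual model.

```python
import torch
import torch.nn as nn

class SentenceSelector(nn.Module):
    """Scores candidate sentences against the question and samples one (the latent choice)."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(self, question, sentences):
        logits = self.score(question.expand_as(sentences), sentences).squeeze(-1)
        dist = torch.distributions.Categorical(logits=logits)
        idx = dist.sample()
        return idx, dist.log_prob(idx)

dim, num_sents = 16, 5
selector = SentenceSelector(dim)
optimizer = torch.optim.Adam(selector.parameters(), lr=1e-2)

question = torch.randn(dim)            # stand-in question encoding
sentences = torch.randn(num_sents, dim)  # stand-in sentence encodings
gold_sentence = 2                      # pretend the answer lives in sentence 2

for _ in range(100):
    idx, log_prob = selector(question, sentences)
    # In the paper the reward comes only from the downstream answer's correctness;
    # here it is a stand-in 1/0 signal.
    reward = 1.0 if idx.item() == gold_sentence else 0.0
    loss = -reward * log_prob          # REINFORCE: raise log-prob of rewarded choices
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```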
Neural Program Search: Solving Programming Tasks from Description and Examples
- Illia Polosukhin, Alexander Skidanov
- Computer Science · International Conference on Learning Representations (ICLR)
- 12 February 2018
Neural Program Search, an algorithm that generates programs from a natural language description and a small number of input/output examples, is presented; it significantly outperforms a sequence-to-sequence-with-attention baseline.
Improving Neural Program Synthesis with Inferred Execution Traces
- Richard Shin, Illia Polosukhin, D. Song
- Computer Science · Neural Information Processing Systems
- 2018
This work splits program synthesis into two parts, first inferring the execution trace from the input/output example and then inferring the program from the trace, which leads to state-of-the-art results in the Karel domain.
NAPS: Natural Program Synthesis Dataset
- Maksym Zavershynskyi, Alexander Skidanov, Illia Polosukhin
- Computer Science · arXiv
- 6 July 2018
A program-synthesis-oriented dataset is presented, consisting of human-written problem statements collected via crowdsourcing and solutions extracted from human-written submissions to programming competitions, accompanied by input/output examples.
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
- Heng-Tze Cheng, Zakaria Haque, J. Xie
- Computer Science · Knowledge Discovery and Data Mining (KDD)
- 8 August 2017
To make out-of-the-box models flexible and usable across a wide range of problems, these canned Estimators are parameterized not only by traditional hyperparameters but also by feature columns, a declarative specification describing how to interpret input data.
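As a hedged illustration of that design, here is a short sketch using the TensorFlow 1.x-era tf.estimator and tf.feature_column APIs the paper describes (since deprecated and removed from recent TensorFlow releases); the feature names and data below are made up.

```python
import tensorflow as tf  # assumes a release that still ships tf.estimator / tf.feature_column

# Feature columns declaratively describe how raw input features are interpreted.
feature_columns = [
    tf.feature_column.numeric_column("age"),
    tf.feature_column.indicator_column(
        tf.feature_column.categorical_column_with_vocabulary_list(
            "country", vocabulary_list=["US", "CA", "MX"])),
]

# A canned Estimator is parameterized by hyperparameters *and* by the feature columns.
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[32, 16],
    n_classes=2,
)

def input_fn():
    # Tiny in-memory dataset standing in for a real input pipeline.
    features = {"age": [25.0, 40.0, 31.0], "country": ["US", "MX", "CA"]}
    labels = [0, 1, 0]
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

estimator.train(input_fn=input_fn, steps=10)
```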
Towards Specification-Directed Program Repair
- Richard Shin, Illia Polosukhin, D. Song
- Computer Science · International Conference on Learning Representations (ICLR)
- 12 February 2018
Hierarchical Question Answering for Long Documents
- Eunsol Choi, D. Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant
- Computer Science · arXiv
- 6 November 2016
A framework for question answering is presented that scales efficiently to longer documents while maintaining or even improving the performance of state-of-the-art models, treating sentence selection as a latent variable trained jointly from the answer alone using reinforcement learning.
...