BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Natural Questions: A Benchmark for Question Answering Research
TLDR
This work presents the Natural Questions corpus, a question answering dataset. It introduces robust metrics for evaluating question answering systems, demonstrates high human upper bounds on these metrics, and establishes baseline results using competitive methods drawn from related literature.
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base
TLDR
This work proposes a novel semantic parsing framework for question answering using a knowledge base that leverages the knowledge base in an early stage to prune the search space and thus simplifies the semantic matching problem.
Latent Retrieval for Weakly Supervised Open Domain Question Answering
TLDR
This work shows for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs, without any IR system, with the resulting system outperforming BM25 by up to 19 points in exact match.
A Knowledge-Grounded Neural Conversation Model
TLDR
This work presents a novel, fully data-driven, knowledge-grounded neural conversation model aimed at producing more contentful responses. It generalizes the widely used Sequence-to-Sequence (seq2seq) approach by conditioning responses on both conversation history and external “facts”, allowing the model to be versatile and applicable in an open-domain setting.
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
TLDR
This work finds that transferring from entailment data is more effective than transferring from paraphrase or extractive QA data, and that, surprisingly, it continues to be very beneficial even when starting from massive pre-trained language models such as BERT.
REALM: Retrieval-Augmented Language Model Pre-Training
TLDR
The effectiveness of Retrieval-Augmented Language Model pre-training (REALM) is demonstrated by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA), where it outperforms all previous methods by a significant margin while also providing qualitative benefits such as interpretability and modularity.
Guiding Semi-Supervision with Constraint-Driven Learning
TLDR
The experimental results presented in the information extraction domain demonstrate that applying constraints helps the model to generate better feedback during learning, and hence the framework allows for high-performance learning with significantly less training data than was possible before on these tasks.
Question Answering Using Enhanced Lexical Semantic Models
TLDR
This work focuses on improving question answering performance using models of lexical semantic resources and shows that these systems can be consistently and significantly improved with rich lexical semantics information, regardless of the choice of learning algorithms.
Driving Semantic Parsing from the World’s Response
TLDR
This paper develops two novel learning algorithms capable of predicting complex structures that rely only on a binary feedback signal based on the context of an external world. It also reformulates the semantic parsing problem to reduce the dependency of the model on syntactic patterns, thus allowing the parser to scale better using less supervision.