BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed…
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of…
Representing Text for Joint Embedding of Text and Knowledge Bases
Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy…
Observed versus latent features for knowledge base and text inference
In this paper we show the surprising effectiveness of a simple observed features model in comparison to latent feature models on two benchmark knowledge base completion datasets, FB15K and WN18. We…
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary…
Natural Questions: A Benchmark for Question Answering Research
We present the Natural Questions corpus, a question answering data set. Questions consist of real anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a…
Latent Retrieval for Weakly Supervised Open Domain Question Answering
Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a black-box information retrieval (IR) system to retrieve evidence candidates…
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several approaches…
Translingual Document Representations from Discriminative Projections
Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a projection of…
The LinGO Redwoods Treebank: Motivation and Preliminary Applications
The LinGO Redwoods initiative is a seed activity in the design and development of a new type of treebank. While several medium- to large-scale treebanks exist for English (and for other major…