Publications
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
TLDR
This work presents CommonsenseQA: a challenging new dataset for commonsense question answering, which extracts from ConceptNet multiple target concepts that have the same semantic relation to a single source concept.
The Web as a Knowledge-Base for Answering Complex Questions
TLDR
This paper proposes to decompose complex questions into a sequence of simple questions and compute the final answer from the sequence of answers; it empirically demonstrates that question decomposition improves performance from 20.8 precision@1 to 27.5 precision@1 on this new dataset.
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
TLDR
In this task, 18 distinct question answering datasets were adapted and unified into the same format, and the best system achieved an average F1 score of 72.5 on the 12 held-out datasets.
oLMpics - On What Language Model Pre-training Captures
TLDR
This work proposes eight reasoning tasks that conceptually require operations such as comparison, conjunction, and composition; its findings can help future work on designing new datasets, models, and objective functions for pre-training.
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
TLDR
It is shown that training on a source reading comprehension (RC) dataset and transferring to a target dataset substantially improves performance, even in the presence of powerful contextual representations from BERT (Devlin et al., 2019).
Question Answering is a Format; When is it Useful?
TLDR
It is argued that question answering should be considered a format which is sometimes useful for studying particular phenomena, not a phenomenon or task in itself.
Repartitioning of the ComplexWebQuestions Dataset
TLDR
It is shown that training an RC model directly on the training data of ComplexWebQuestions reveals leakage from the training set to the test set that allows the model to obtain unreasonably high performance.
Comprehensive Multi-Dataset Evaluation of Reading Comprehension
TLDR
An evaluation server, ORB, is presented that reports performance on seven diverse reading comprehension datasets, encouraging and facilitating the testing of a single model's capability to understand a wide variety of reading phenomena.
Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
TLDR
This work provides a first demonstration that LMs can be trained to reliably perform systematic reasoning that combines implicit, pre-trained knowledge with explicit natural language statements, and shows that models learn to effectively perform inference involving implicit taxonomic and world knowledge, chaining, and counting.
On Making Reading Comprehension More Comprehensive
TLDR
This work justifies a question answering approach to reading comprehension and describes the various kinds of questions one might use to more fully test a system's comprehension of a passage, moving beyond questions that only probe local predicate-argument structures.