CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
This work presents CommonsenseQA: a challenging new dataset for commonsense question answering, which extracts from ConceptNet multiple target concepts that have the same semantic relation to a single source concept.
The Web as a Knowledge-Base for Answering Complex Questions
This paper proposes to decompose complex questions into a sequence of simple questions and compute the final answer from the sequence of answers, and empirically demonstrates that question decomposition improves performance from 20.8 to 27.5 precision@1 on this new dataset.
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
- Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, Danqi Chen
- Computer Science · EMNLP
- 22 October 2019
In this task, 18 distinct question answering datasets were adapted and unified into the same format, and the best system achieved an average F1 score of 72.5 on the 12 held-out datasets.
oLMpics-On What Language Model Pre-training Captures
- Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Computer Science · Transactions of the Association for Computational…
- 31 December 2019
This work proposes eight reasoning tasks, which conceptually require operations such as comparison, conjunction, and composition, and findings can help future work on designing new datasets, models, and objective functions for pre-training.
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
It is shown that training on a source RC dataset and transferring to a target dataset substantially improves performance, even in the presence of powerful contextual representations from BERT (Devlin et al., 2019).
Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
- Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- Computer Science · NeurIPS
- 11 June 2020
This work provides a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements, and shows that models learn to effectively perform inference involving implicit taxonomic and world knowledge, chaining, and counting.
ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension
- Dheeru Dua, Ananth Gottumukkala, Alon Talmor, Sameer Singh, Matt Gardner
- Computer Science · ArXiv
- 29 December 2019
An evaluation server, ORB, is presented that reports performance on seven diverse reading comprehension datasets, encouraging and facilitating the testing of a single model's capability to understand a wide variety of reading phenomena.
MultiModalQA: Complex Question Answering over Text, Tables and Images
This paper creates MMQA, a challenging question answering dataset that requires joint reasoning over text, tables, and images, and defines a formal language that takes questions answerable from a single modality and combines them to generate cross-modal questions.
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
This work proposes gamification as a framework for data construction and creates CommonsenseQA 2.0, which includes 14,343 yes/no questions, and demonstrates its difficulty for models that are orders of magnitude larger than the AI used in the game itself.
Repartitioning of the ComplexWebQuestions Dataset
It is shown that training an RC model directly on the training data of ComplexWebQuestions reveals a leakage from the training set to the test set that allows one to obtain unreasonably high performance.