Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
- Mor Geva, Daniel Khashabi, Elad Segal, Tushar Khot, D. Roth, Jonathan Berant
- Computer Science, Transactions of the Association for Computational…
- 6 January 2021
This work introduces StrategyQA, a question answering benchmark in which the required reasoning steps are implicit in the question and must be inferred using a strategy. It also proposes a data collection procedure that combines term-based priming to inspire annotators, careful control over the annotator population, and adversarial filtering to eliminate reasoning shortcuts.
A Simple and Effective Model for Answering Multi-span Questions
- Elad Segal, Avia Efrat, Mor Shoham, A. Globerson, Jonathan Berant
- Computer Science, EMNLP
- 29 September 2019
This work proposes a new approach to multi-span questions based on sequence tagging, in contrast to previous approaches for answering span questions, and shows that it yields an absolute improvement and slightly surpasses the current state-of-the-art results on the entire DROP dataset.
SCROLLS: Standardized CompaRison Over Long Language Sequences
This work introduces SCROLLS, a suite of tasks that require reasoning over long texts. The authors examine existing long-text datasets and handpick those where the text is naturally long, prioritizing tasks that involve synthesizing information across the input.
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
An evaluation of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters, finds that model performance and calibration both improve with scale but remain poor in absolute terms.