Compositional Questions Do Not Necessitate Multi-hop Reasoning

@inproceedings{Min2019CompositionalQD,
  title={Compositional Questions Do Not Necessitate Multi-hop Reasoning},
  author={Sewon Min and Eric Wallace and Sameer Singh and Matt Gardner and Hannaneh Hajishirzi and Luke Zettlemoyer},
  booktitle={ACL},
  year={2019}
}
Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs. [...] We introduce a single-hop BERT-based RC model that achieves 67 F1, comparable to state-of-the-art multi-hop models. We also design an evaluation setting where humans are not shown all of the necessary paragraphs for the intended multi-hop reasoning but can still answer over 80% of questions. Together with detailed error analysis, these results suggest there should…
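The abstract's central point is that a reader which scores each paragraph independently, with no evidence chaining across paragraphs, already answers many "multi-hop" questions. The sketch below illustrates that single-hop setup only; the lexical-overlap scorer is a toy stand-in for the paper's BERT-based reader, and all function names here are hypothetical, not from the authors' code.

```python
import re


def _tokens(text: str) -> set:
    """Lowercase alphabetic tokens (toy tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(question: str, paragraph: str) -> int:
    """Count question words that also appear in the paragraph (toy relevance score)."""
    return len(_tokens(question) & _tokens(paragraph))


def single_hop_answer(question: str, paragraphs: list) -> str:
    """Pick the single best-scoring paragraph and answer from it alone.

    No information flows between paragraphs: each is scored independently,
    which is the 'single-hop' behavior the paper's baseline exploits.
    A real reader would extract an answer span; we return the paragraph.
    """
    return max(paragraphs, key=lambda p: overlap_score(question, p))
```

For example, `single_hop_answer("Where is the Eiffel Tower located", paras)` returns the one paragraph mentioning the Eiffel Tower, even when the intended reasoning chain spans several paragraphs.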
Citations

Do Multi-hop Readers Dream of Reasoning Chains?
A systematic analysis assessing whether providing the full reasoning chain of multiple passages, instead of just the final passage where the answer appears, improves the performance of existing QA models, and whether models with better reasoning abilities need to be developed.
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
This study presents a new multi-hop QA dataset, 2WikiMultiHopQA, which uses structured and unstructured data and introduces evidence information containing a reasoning path for multi-hop questions; experiments demonstrate that the dataset is challenging for multi-hop models and that it ensures multi-hop reasoning is required.
Final Report on Multi-hop Reading Comprehension
  • Woojeong Jin
  • 2019
Learning multi-hop reasoning has been a key challenge for reading comprehension models. Ideally, a model should not be able to perform well on a multi-hop question answering task without doing…
TextGraphs 2021 Shared Task on Multi-Hop Inference for Explanation Regeneration
This edition of the shared task makes use of approximately 250k manual explanatory relevancy ratings that augment the 2020 shared-task data, and performs a detailed analysis of participating systems, evaluating various aspects of the multi-hop inference process.
TextGraphs 2019 Shared Task on Multi-Hop Inference for Explanation Regeneration
The shared task asks participants to regenerate detailed gold explanations for standardized elementary science exam questions by selecting facts from a knowledge base of semi-structured tables.
Measuring and Reducing Non-Multifact Reasoning in Multi-hop Question Answering
Formalizes disconnected reasoning and introduces an automated sufficiency-based dataset transformation that considers all possible partitions of supporting facts to capture it, proposing contrastive support sufficiency as a better test of multifact reasoning.
Self-Assembling Modular Networks for Interpretable Multi-Hop Reasoning
This work presents an interpretable, controller-based self-assembling neural modular network for multi-hop reasoning, in which four novel modules (Find, Relocate, Compare, NoOp) are designed to perform distinct types of language reasoning.
MuSiQue: Multi-hop Questions via Single-hop Question Composition
This work proposes a bottom-up, semi-automatic process for constructing multi-hop questions via composition of single-hop questions, and uses this process to build a new multi-hop QA dataset, MuSiQue-Ans, which is challenging for state-of-the-art QA models.
Generating Followup Questions for Interpretable Multi-hop Question Answering
We propose a framework for answering open-domain multi-hop questions in which partial information is read and used to generate followup questions, to finally be answered by a pretrained single-hop…
Transformer-XH: Multi-Hop Question Answering with eXtra Hop Attention
Transformers have obtained significant success modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structures beyond a…

References

Showing 1-10 of 19 references
Understanding Dataset Design Choices for Multi-hop Reasoning
This paper investigates two recently proposed datasets, WikiHop and HotpotQA, and explores sentence-factored models for these tasks; by design, these models cannot do multi-hop reasoning, but they are still able to solve a large number of examples in both datasets.
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
A novel task to encourage the development of models for text understanding across multiple documents and to investigate the limits of existing methods, in which a model learns to seek and combine evidence, effectively performing multi-hop (alias multi-step) inference.
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
It is shown that HotpotQA is challenging for the latest QA systems, and that the supporting facts enable models to improve performance and make explainable predictions.
The Web as a Knowledge-Base for Answering Complex Questions
This paper proposes to decompose complex questions into a sequence of simple questions and compute the final answer from the sequence of answers; empirically, question decomposition improves performance from 20.8 precision@1 to 27.5 precision@1 on the new dataset.
The NarrativeQA Reading Comprehension Challenge
A new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts, designed so that successfully answering the questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience.
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
It is shown that, in comparison to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, exhibits considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and requires more cross-sentence reasoning to find answers.
Bidirectional Attention Flow for Machine Comprehension
Introduces the BiDAF network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization.
What Makes Reading Comprehension Questions Easier?
This study proposes simple heuristics to split each dataset into easy and hard subsets and examines the performance of two baseline models on each subset, observing that baseline performance on the hard subsets degrades remarkably compared to that on the entire datasets.
Semantic Parsing on Freebase from Question-Answer Pairs
This paper trains a semantic parser that scales up to Freebase and outperforms the state-of-the-art parser on the dataset of Cai and Yates (2013), despite not having annotated logical forms.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).