Corpus ID: 237491581

Entity-Based Knowledge Conflicts in Question Answering

@inproceedings{Longpre2021EntityBasedKC,
  title={Entity-Based Knowledge Conflicts in Question Answering},
  author={Shayne Longpre and Kartik Kumar Perisetla and Anthony Chen and Nikhil Ramesh and Chris DuBois and Sameer Singh},
  booktitle={EMNLP},
  year={2021}
}
Knowledge-dependent tasks typically use two sources of knowledge: parametric, learned at training time, and contextual, given as a passage at inference time. To understand how models use these sources together, we formalize the problem of knowledge conflicts, where the contextual information contradicts the learned information. Analyzing the behaviour of popular models, we measure their over-reliance on memorized information (the cause of hallucinations), and uncover important factors that…
Towards Continual Knowledge Learning of Language Models
TLDR
This work constructs a new benchmark and metric to quantify the retention of time-invariant world knowledge, the update of outdated knowledge, and the acquisition of new knowledge in Continual Knowledge Learning.
On the Robustness of Reading Comprehension Models to Entity Renaming
  • Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren
  • Computer Science
  • ArXiv
  • 2021
TLDR
This work proposes a general and scalable method to replace person names with names from a variety of sources, ranging from common English names to names from other languages to arbitrary strings, and finds that they can further improve the robustness of MRC models.
Retrieval-guided Counterfactual Generation for QA
TLDR
This work develops a Retrieve-Generate-Filter (RGF) technique to create counterfactual evaluation and training data with minimal human supervision, and finds that RGF data leads to significant improvements in a model’s robustness to local perturbations.
ContraQA: Question Answering under Contradicting Contexts
TLDR
A misinformation-aware QA system is built as a counter-measure that integrates question answering and misinformation detection in a joint fashion to defend against the threat of misinformation.

References

SHOWING 1-10 OF 33 REFERENCES
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
REALM: Retrieval-Augmented Language Model Pre-Training
TLDR
The effectiveness of Retrieval-Augmented Language Model pre-training (REALM) is demonstrated by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA) and is found to outperform all previous methods by a significant margin, while also providing qualitative benefits such as interpretability and modularity.
Hurdles to Progress in Long-form Question Answering
TLDR
The task formulation raises fundamental challenges regarding evaluation and dataset creation that currently preclude meaningful modeling progress, and a new system that relies on sparse attention and contrastive retriever learning to achieve state-of-the-art performance on the ELI5 LFQA dataset is designed.
An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering
TLDR
This work investigates the relative benefits of large pre-trained language models, various data sampling strategies, as well as query and context paraphrases generated by back-translation, and finds a simple negative sampling technique to be particularly effective.
Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
TLDR
A detailed study of the test sets of three popular open-domain benchmark datasets finds that 30% of test-set questions have a near-duplicate paraphrase in their corresponding train sets, and that simple nearest-neighbor models outperform a BART closed-book QA model.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
Reading Wikipedia to Answer Open-Domain Questions
TLDR
This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs, indicating that both modules are highly competitive with respect to existing counterparts.
NewsQA: A Machine Comprehension Dataset
TLDR
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
Retrieval-Augmented Controllable Review Generation
TLDR
This paper proposes to additionally leverage references, which are selected from a large pool of texts labeled with one of the attributes, as textual information that enriches inductive biases of given attributes.
Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP
TLDR
It is found that the retrievers exhibit popularity bias, significantly under-performing on rarer entities that share a name, e.g., they are twice as likely to retrieve erroneous documents on queries for the less popular entity under the same name.