CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

@article{Talmor2019CommonsenseQAAQ,
  title={CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge},
  author={Alon Talmor and Jonathan Herzig and Nicholas Lourie and Jonathan Berant},
  journal={ArXiv},
  year={2019},
  volume={abs/1811.00937}
}
When answering a question, people often draw upon their rich world knowledge in addition to some task-specific context. […] Our best baseline, the OpenAI GPT (Radford et al., 2018), obtains 54.8% accuracy, well below human performance, which is 95.3%.

Incorporating Domain Knowledge and Semantic Information into Language Models for Commonsense Question Answering

TLDR
This work proposes an approach to incorporate domain knowledge and semantic information into language-model-based approaches for a better understanding of the related commonsense knowledge, and utilizes Semantic Role Labeling to enable the system to gain a better understanding of relations among relevant entities.

A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

TLDR
This work introduces A-OKVQA, a crowdsourced dataset composed of a diverse set of about 25K questions requiring a broad base of commonsense and world knowledge to answer, and demonstrates the potential of this new dataset through a detailed analysis of its contents and baseline performance measurements over a variety of state-of-the-art vision–language models.

Improving Commonsense Question Answering by Graph-based Iterative Retrieval over Multiple Knowledge Sources

TLDR
A novel question-answering method by integrating multiple knowledge sources, i.e. ConceptNet, Wikipedia, and the Cambridge Dictionary, to boost the performance, and introduces a novel graph-based iterative knowledge retrieval module, which iteratively retrieves concepts and entities related to the given question and its choices from multiple knowledge sources.

Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering

TLDR
This paper augments a general commonsense QA framework with a knowledgeable path generator by extrapolating over existing paths in a KG with a state-of-the-art language model, which learns to connect a pair of entities in text with a dynamic, and potentially novel, multi-hop relational path.

Reasoning Paths Generation for Commonsense Question Answering

TLDR
This paper proposes to learn a reasoning-path generator that produces structured evidence dynamically according to the questions, leveraging the tremendous unstructured knowledge stored in the language model to alleviate the incompleteness of the knowledge graph.

Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization

TLDR
Four translation methods that can translate natural questions into cloze-style sentences to better solicit commonsense knowledge from language models are investigated, including a syntactic-based model, an unsupervised neural model, and two supervised neural models.

Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering

TLDR
A novel neuro-symbolic framework for zero-shot question answering across commonsense tasks is proposed and it is shown that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.

Knowledge-driven Self-supervision for Zero-shot Commonsense Question Answering

TLDR
A novel neuro-symbolic framework for zero-shot question answering across commonsense tasks is proposed and it is shown that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.

Fusing Context Into Knowledge Graph for Commonsense Question Answering

TLDR
This work retrieves descriptions of related concepts from Wiktionary and feeds them as additional input to pretrained language models to provide contextual information for knowledge understanding and achieves state-of-the-art result in the CommonsenseQA dataset and the best result among non-generative models in OpenBookQA.

Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation

TLDR
This work benchmarks knowledge-enhanced CQA by conducting extensive experiments on multiple standard CQA datasets using a simple and effective knowledge-to-text transformation framework, and shows that context-sensitive knowledge selection, heterogeneous knowledge exploitation, and commonsense-rich language models are promising CQA directions.
...

References

SHOWING 1-10 OF 44 REFERENCES

Deep Learning for Answer Sentence Selection

TLDR
This work proposes a novel approach to solving the answer sentence selection task via means of distributed representations, and learns to match questions with answers by considering their semantic encoding.

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

TLDR
A new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering constitute the AI2 Reasoning Challenge (ARC), which requires far more powerful knowledge and reasoning than previous challenges such as SQuAD or SNLI.

SQuAD: 100,000+ Questions for Machine Comprehension of Text

TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).

A Simple Method for Commonsense Reasoning

TLDR
Key to this method is the use of language models, trained on a massive amount of unlabeled data, to score multiple-choice questions posed by commonsense reasoning tests, which outperform previous state-of-the-art methods by a large margin.

A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories

TLDR
A new framework for evaluating story understanding and script learning: the 'Story Cloze Test', which requires a system to choose the correct ending to a four-sentence story, and a new corpus of 50k five-sentence commonsense stories, ROCStories, to enable this evaluation.

MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge

TLDR
A large dataset of narrative texts and questions about these texts, intended to be used in a machine comprehension task that requires reasoning using commonsense knowledge, and shows that the mode of data collection via crowdsourcing results in a substantial amount of inference questions.

Improving Language Understanding by Generative Pre-Training

TLDR
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.

Bidirectional Attention Flow for Machine Comprehension

TLDR
The BIDAF network is introduced, a multi-stage hierarchical process that represents the context at different levels of granularity and uses bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization.

Annotation Artifacts in Natural Language Inference Data

TLDR
It is shown that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI and 53% of MultiNLI, and that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes.

SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

TLDR
This paper introduces the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning, and proposes Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data.