Crowdsourcing Multiple Choice Science Questions
@article{Welbl2017CrowdsourcingMC, title={Crowdsourcing Multiple Choice Science Questions}, author={Johannes Welbl and Nelson F. Liu and Matt Gardner}, journal={ArXiv}, year={2017}, volume={abs/1707.06209} }
We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers. Generating these questions can be difficult without trading away originality, relevance or diversity in the answer options. Our method addresses these problems by leveraging a large corpus of domain-specific text and a small set of existing questions. It produces model suggestions for document selection and answer distractor choice which aid the human question generation process…
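The distractor-suggestion step lends itself to a minimal sketch: rank vocabulary words by embedding similarity to the correct answer and propose the nearest ones as distractors. This is an illustrative approximation only, not the authors' exact model (which also conditions on the question and on existing questions); the toy vectors and the `suggest_distractors` helper are hypothetical.

```python
import numpy as np

# Toy word vectors standing in for embeddings trained on a science corpus
# (hypothetical values, for illustration only).
VECTORS = {
    "mitochondria": np.array([0.9, 0.1, 0.0]),
    "ribosome":     np.array([0.8, 0.2, 0.1]),
    "chloroplast":  np.array([0.7, 0.3, 0.0]),
    "nucleus":      np.array([0.6, 0.4, 0.1]),
    "volcano":      np.array([0.0, 0.1, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def suggest_distractors(answer, k=3):
    """Rank candidate distractors by embedding similarity to the answer."""
    scored = [
        (word, cosine(VECTORS[answer], vec))
        for word, vec in VECTORS.items()
        if word != answer
    ]
    return [w for w, _ in sorted(scored, key=lambda x: -x[1])[:k]]

print(suggest_distractors("mitochondria"))  # -> ['ribosome', 'chloroplast', 'nucleus']
```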
124 Citations
Quiz-Style Question Generation for News Stories
- Computer Science, WWW
- 2021
This work proposes a series of novel techniques for applying large pre-trained Transformer encoder-decoder models, namely PEGASUS and T5, to the tasks of question-answer generation and distractor generation, and shows that these models outperform strong baselines using both automated metrics and human raters.
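With the Hugging Face transformers library, applying a T5-style encoder-decoder to answer-aware question generation typically follows the pattern sketched below. The checkpoint name and the `answer: ... context: ...` prompt format are assumptions: a stock t5-small is not fine-tuned for question generation, so meaningful output requires a QG-fine-tuned checkpoint.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "t5-small" is a stand-in; in practice load a checkpoint fine-tuned
# for answer-aware question generation.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

context = "Mitochondria produce most of the cell's chemical energy."
prompt = f"answer: mitochondria context: {context}"  # assumed prompt format

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```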
Unsupervised multiple-choice question generation for out-of-domain Q&A fine-tuning
- Computer Science, ACL
- 2022
It is shown that the state-of-the-art model UnifiedQA can greatly benefit from fine-tuning on questions and answers generated by a rule-based algorithm from unannotated sentences, evaluated on a multiple-choice benchmark about physics, biology, and chemistry that it has never been trained on.
Improving Question Answering with External Knowledge
- Computer Science, EMNLP
- 2019
This work explores simple yet effective methods for exploiting two sources of unstructured external knowledge for multiple-choice question answering in subject areas such as science.
Knowledge-Driven Distractor Generation for Cloze-style Multiple Choice Questions
- Computer Science, AAAI
- 2021
A novel configurable framework automatically generates distractor choices for open-domain cloze-style multiple-choice questions; the generated distractors outperform previous methods in both automatic and human evaluation.
Answering Science Exam Questions Using Query Reformulation with Background Knowledge
- Computer Science, AKBC
- 2019
This paper presents a system that reformulates a given question into queries that are used to retrieve supporting text from a large corpus of science-related text and outperforms several strong baselines on the ARC dataset.
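A minimal sketch of this retrieve-then-score pattern (plain keyword extraction stands in for the paper's learned reformulation; the corpus, names, and scoring here are illustrative):

```python
import re

STOPWORDS = {"the", "a", "an", "of", "is", "what", "which", "to", "in", "and"}

CORPUS = [  # stand-in for a large science corpus
    "photosynthesis converts light energy into chemical energy in plants",
    "mitochondria produce most of the cell's chemical energy",
]

def reformulate(question):
    """Strip stopwords to form a keyword query (toy reformulation)."""
    terms = re.findall(r"[a-z]+", question.lower())
    return [t for t in terms if t not in STOPWORDS]

def score_option(query, option):
    """Score an answer option by overlap with the best-matching sentence."""
    best = max(CORPUS, key=lambda s: len(set(query) & set(s.split())))
    return len(set(option.lower().split()) & set(best.split()))

query = reformulate("What converts light energy into chemical energy?")
options = ["photosynthesis in plants", "volcanic eruption"]
print(max(options, key=lambda o: score_option(query, o)))  # -> 'photosynthesis in plants'
```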
Investigating Crowdsourcing to Generate Distractors for Multiple-Choice Assessments
- Computer Science, NCS
- 2019
The results suggest that crowdsourcing can be a very useful tool for generating effective distractors (those attractive to subjects who do not understand the targeted concept), and that this method is faster, easier, and cheaper than the traditional approach of having one or more experts draft distractors.
Ranking Multiple Choice Question Distractors using Semantically Informed Neural Networks
- Computer Science, CIKM
- 2020
Experimental results demonstrate that the proposed CNN-BiLSTM model surpasses existing baseline models, and that incorporating word-level semantic information alongside context-specific word embeddings boosts distractor-ranking performance, a promising direction for further research.
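As a rough sketch of this kind of architecture (dimensions and layer choices below are illustrative, not the paper's exact configuration), a ranker can encode a question-plus-candidate token sequence with convolutions over embeddings, a BiLSTM, and a scoring head:

```python
import torch
import torch.nn as nn

class CNNBiLSTMRanker(nn.Module):
    """Score a (question + candidate distractor) token sequence; higher = better."""

    def __init__(self, vocab_size=10_000, emb_dim=100, channels=64, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # 1-D convolution over the token axis extracts local n-gram features.
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(channels, hidden, bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.emb(token_ids)                       # (batch, seq_len, emb_dim)
        x = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, channels, seq_len)
        out, _ = self.lstm(x.transpose(1, 2))         # (batch, seq_len, 2*hidden)
        return self.score(out[:, -1, :]).squeeze(-1)  # one score per sequence

model = CNNBiLSTMRanker()
batch = torch.randint(0, 10_000, (4, 12))  # four question+candidate sequences
print(model(batch).shape)                  # -> torch.Size([4])
```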
Answering Science Exam Questions Using Query Rewriting with Background Knowledge
- Computer Science, ArXiv
- 2018
A system is presented that rewrites a given question into queries used to retrieve supporting text from a large corpus of science-related text, outperforming several strong baselines on the ARC dataset.
Generating Answer Candidates for Quizzes and Answer-Aware Question Generators
- Computer Science, RANLP
- 2021
This work proposes a model that can generate a specified number of answer candidates for a given passage of text, which can then be used by instructors to write questions manually or can be passed as an input to automatic answer-aware question generators.
47 References
SQuAD: 100,000+ Questions for Machine Comprehension of Text
- Computer Science, EMNLP
- 2016
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
Good Question! Statistical Ranking for Question Generation
- Computer Science, NAACL
- 2010
This work uses manually written rules to perform a sequence of general purpose syntactic transformations to turn declarative sentences into questions, which are ranked by a logistic regression model trained on a small, tailored dataset consisting of labeled output from the system.
Who did What: A Large-Scale Person-Centered Cloze Dataset
- Computer Science, EMNLP
- 2016
A new "Who-did-What" dataset of over 200,000 fill-in-the-gap (cloze) multiple choice reading comprehension problems constructed from the LDC English Gigaword newswire corpus is constructed and proposed as a challenge task for the community.
Large-scale Simple Question Answering with Memory Networks
- Computer Science, ArXiv
- 2015
This paper studies the impact of multitask and transfer learning for simple question answering, a setting in which the reasoning required to answer is quite easy as long as one can retrieve the correct evidence for a question, which can be difficult in large-scale conditions.
Text Understanding with the Attention Sum Reader Network
- Computer Science, ACL
- 2016
A new, simple model is presented that uses attention to pick the answer directly from the context, rather than computing it from a blended representation of document words as is usual in similar models; this makes it particularly suitable for question-answering problems where the answer is a single word from the document.
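The attention-sum mechanism is compact enough to sketch: softmax attention weights over document positions are summed across every occurrence of each candidate, and the candidate with the largest total wins (toy scores and names below are illustrative):

```python
import numpy as np

def attention_sum(doc_tokens, attention_logits, candidates):
    """Pick the candidate whose occurrences accumulate the most attention mass."""
    weights = np.exp(attention_logits - attention_logits.max())
    weights /= weights.sum()  # softmax over document positions
    totals = {
        c: sum(w for tok, w in zip(doc_tokens, weights) if tok == c)
        for c in candidates
    }
    return max(totals, key=totals.get)

doc = ["the", "cell", "uses", "mitochondria", "and", "mitochondria", "make", "energy"]
logits = np.array([0.1, 0.5, 0.2, 2.0, 0.1, 1.5, 0.3, 0.9])
print(attention_sum(doc, logits, ["cell", "mitochondria", "energy"]))
# -> 'mitochondria' (its two occurrences sum to the most attention mass)
```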
WikiQA: A Challenge Dataset for Open-Domain Question Answering
- Computer Science, EMNLP
- 2015
The WIKIQA dataset is described, a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering, which is more than an order of magnitude larger than the previous dataset.
Automatic Generation of Challenging Distractors Using Context-Sensitive Inference Rules
- Computer Science, BEA@ACL
- 2014
This work proposes to employ context-sensitive lexical inference rules in order to generate distractors that are semantically similar to the gap target word in some sense, but not in the particular sense induced by the gap-fill context.
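The underlying idea, pick words related to the gap word in some sense other than the one the context selects, can be sketched with toy sense-specific similarity lists (the data and helper below are hypothetical; the paper derives such candidates from context-sensitive lexical inference rules):

```python
# Toy sense-specific similarity lists (hypothetical data).
SIMILAR = {
    "bank": {"finance": ["lender", "treasury"], "river": ["shore", "riverbed"]},
}

def distractors_for_gap(target, context_sense):
    """Take near-synonyms from the *other* senses of the target word, so each
    distractor is related to the word but wrong in this gap's context."""
    return [
        word
        for sense, words in SIMILAR[target].items()
        if sense != context_sense
        for word in words
    ]

# Gap: "She deposited her salary at the ____."  (finance sense)
print(distractors_for_gap("bank", context_sense="finance"))  # -> ['shore', 'riverbed']
```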
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
- Computer Science, CoCo@NIPS
- 2016
This new dataset aims to overcome a number of well-known weaknesses of previous publicly available datasets for reading comprehension and question answering, and is the most comprehensive real-world dataset of its kind in both quantity and quality.
Answering Elementary Science Questions by Constructing Coherent Scenes using Background Knowledge
- Computer Science, EMNLP
- 2015
This work shows that by using a simple “knowledge graph” representation of the question, it can leverage several large-scale linguistic resources to provide missing background knowledge, somewhat alleviating the knowledge bottleneck in previous approaches.
A Selection Strategy to Improve Cloze Question Quality
- Computer Science
- 2008
We present a strategy to improve the quality of automatically generated cloze and open cloze questions which are used by the REAP tutoring system for assessment in the ill-defined domain of English…