Few-Shot Question Answering by Pretraining Span Selection

@inproceedings{Ram2021FewShotQA,
  title={Few-Shot Question Answering by Pretraining Span Selection},
  author={Ori Ram and Yuval Kirstain and Jonathan Berant and Amir Globerson and Omer Levy},
  booktitle={ACL},
  year={2021}
}
In several question answering benchmarks, pretrained models have reached human parity when fine-tuned on the order of 100,000 annotated questions and answers. We explore the more realistic few-shot setting, where only a few hundred training examples are available, and observe that standard models perform poorly, highlighting the discrepancy between current pretraining objectives and question answering. We propose a new pretraining scheme tailored for question answering: recurring span…
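The recurring-span idea in the abstract can be sketched roughly as follows. This is a minimal illustration of one plausible reading, not the paper's implementation: `recurring_spans`, `mask_recurring`, and the `[QUESTION]` placeholder are simplified assumptions, operating on whitespace tokens rather than subwords and ignoring overlapping occurrences.

```python
from collections import defaultdict

def recurring_spans(tokens, min_len=2, max_len=5):
    """Find token spans (as tuples) that occur more than once in the passage."""
    occurrences = defaultdict(list)
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            occurrences[tuple(tokens[i:i + n])].append(i)
    return {span: starts for span, starts in occurrences.items() if len(starts) > 1}

def mask_recurring(tokens, span, starts, question_token="[QUESTION]"):
    """Replace every occurrence of `span` except the first with a single
    placeholder token; the surviving occurrence serves as the gold answer span.
    (A real implementation would also guard against overlapping occurrences.)"""
    keep = starts[0]              # leave one occurrence intact
    masked = []
    skip_until = -1
    for i, tok in enumerate(tokens):
        if i < skip_until:
            continue              # inside a masked span: drop the tokens
        if i in starts and i != keep:
            masked.append(question_token)
            skip_until = i + len(span)
        else:
            masked.append(tok)
    return masked
```

A self-supervised question answering example then falls out for free: the model is asked to point from the placeholder back to the surviving occurrence, mirroring extractive span selection at fine-tuning time.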

Citations

FewshotQA: A simple framework for few-shot learning of question answering tasks using pre-trained text-to-text models
TLDR
A simple fine-tuning framework is proposed that leverages pre-trained text-to-text models and aligns directly with their pre-training objective, leading to significant gains on multiple QA benchmarks and translating well to a multilingual setting.
How Optimal is Greedy Decoding for Extractive Question Answering?
Fine-tuned language models use greedy decoding to answer reading comprehension questions with relative success. However, this approach does not ensure that the answer is a span in the given passage…
Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling
TLDR
A pre-training task tailored for cross-lingual sequence labeling (xSL), named Cross-lingual Language Informative Span Masking (CLISM), is designed to eliminate the objective gap in a self-supervised manner. In addition, ContrAstive-Consistency Regularization (CACR) is presented, which uses contrastive learning to encourage consistency between representations of parallel input sequences via unsupervised cross-lingual instance-wise training signals during pre-training.
ReasonBERT: Pre-trained to Reason with Distant Supervision
TLDR
This work proposes a generalized notion of distant supervision to automatically connect multiple pieces of text and tables to create pre-training examples that require long-range reasoning, and conducts a comprehensive evaluation on a variety of extractive question answering datasets that require various reasoning capabilities.
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering
TLDR
This paper proposes an omnivorous pretraining approach that consumes both natural and synthetic data to endow models with complementary abilities, and performs extensive experiments to demonstrate the superiority of the resulting model, OmniTab.
PaintTeR: Automatic Extraction of Text Spans for Generating Art-Centered Questions
TLDR
To the best of the authors' knowledge, this is the first work to effectively fine-tune question generation models using minimal supervision for a low-resource, specialized context such as gallery visits.
Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents
TLDR
A new information extraction model for business documents is proposed that takes advantage of both span extraction and sequence labeling, and is significantly faster than a standard span-based extraction method.
A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities
TLDR
A novel taxonomy is proposed that classifies existing work according to the level of knowledge abstraction, in line with the challenges of FSL, and a set of related concepts, including few-shot learning, transfer learning, and meta-learning, is compared.
ProQA: Structural Prompt-based Pre-training for Unified Question Answering
TLDR
ProQA is a unified QA paradigm that solves various tasks through a single model that takes a unified structural prompt as the bridge and improves the QA-centric ability by structural prompt-based pre-training.
Few-shot Mining of Naturally Occurring Inputs and Outputs
TLDR
This method mines naturally occurring, high-quality input-output pairs that mimic the style of a seed set for multiple tasks, and sees improvements of 1.46 ROUGE-L on XSum abstractive summarization.
...

References

Showing 1-10 of 43 references
Know What You Don’t Know: Unanswerable Questions for SQuAD
TLDR
SQuadRUn is a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
Adam: A Method for Stochastic Optimization
TLDR
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
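The update rule summarized above can be sketched in a few lines. This is a minimal single-parameter version for illustration; the variable names are mine, not the paper's pseudocode.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: maintain exponential moving averages of the gradient
    (first moment m) and its square (second moment v), correct their
    initialization bias, then step by the ratio of the corrected moments."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)      # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

Because the step is normalized by the second-moment estimate, the effective step size is roughly bounded by `lr` regardless of the raw gradient scale, which is what makes Adam's learning rate relatively easy to tune.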
RoBERTa: A Robustly Optimized BERT Pretraining Approach
TLDR
It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
TLDR
In this task, 18 distinct question answering datasets were adapted and unified into the same format and the best system achieved an average F1 score of 72.5 on the 12 held-out datasets.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
TLDR
The task of Multi-Modal Machine Comprehension (M3C), which aims at answering multimodal questions given a context of text, diagrams and images, is introduced and state-of-the-art methods for textual machine comprehension and visual question answering are extended to the TQA dataset.
SpanBERT: Improving Pre-training by Representing and Predicting Spans
TLDR
The approach extends BERT by masking contiguous random spans, rather than random tokens, and training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it.
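The span-masking step described above can be sketched as follows, assuming the clipped geometric length distribution the paper describes; the function name and the single-span simplification are illustrative (the real scheme masks spans until a budget of the sequence is reached and works on subword tokens).

```python
import random

def mask_contiguous_span(tokens, p=0.2, max_len=10, mask_token="[MASK]"):
    """Mask one contiguous span: sample its length from a clipped geometric
    distribution, pick a random start, and replace every token in the span,
    rather than masking independent random tokens."""
    limit = min(max_len, len(tokens))
    # geometric sampling: extend the span with probability (1 - p)
    span_len = 1
    while span_len < limit and random.random() > p:
        span_len += 1
    start = random.randrange(len(tokens) - span_len + 1)
    masked = list(tokens)
    for i in range(start, start + span_len):
        masked[i] = mask_token
    return masked, start, span_len
```

Masking whole spans forces the model to predict multi-token content from the span boundaries, which is a closer match to extractive QA than predicting isolated tokens.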
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
TLDR
It is shown that, in comparison to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and requires more cross-sentence reasoning to find answers.
NewsQA: A Machine Comprehension Dataset
TLDR
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
...