Dynamic Sampling Strategies for Multi-Task Reading Comprehension

@inproceedings{gottumukkala2020dynamic,
  title={Dynamic Sampling Strategies for Multi-Task Reading Comprehension},
  author={Ananth Gottumukkala and Dheeru Dua and Sameer Singh and Matt Gardner},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
}
Building general reading comprehension systems, capable of solving multiple datasets at the same time, is a recent aspirational goal in the research community. Prior work has focused on model architecture or generalization to held-out datasets, and largely passed over the particulars of the multi-task learning setup. We show that a simple dynamic sampling strategy, selecting instances for training proportional to the multi-task model's current performance on a dataset relative to its single…
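The sampling rule described in the abstract can be made concrete with a short sketch. The following is a minimal illustration, not the authors' implementation: each dataset is weighted by how far the multi-task model currently lags behind its single-task (upper-bound) score, and the gaps are normalized into sampling probabilities. All function and variable names here are assumptions for illustration.

```python
import random

def sampling_weights(current_scores, single_task_scores):
    """Weight each dataset by the gap between its single-task
    (upper-bound) score and the multi-task model's current score.
    Datasets with a larger remaining gap are sampled more often."""
    gaps = {
        name: max(single_task_scores[name] - current_scores.get(name, 0.0), 0.0)
        for name in single_task_scores
    }
    total = sum(gaps.values())
    if total == 0.0:  # caught up everywhere: fall back to uniform sampling
        return {name: 1.0 / len(gaps) for name in gaps}
    return {name: g / total for name, g in gaps.items()}

def sample_dataset(weights, rng=random):
    """Draw one dataset name according to the computed probabilities."""
    names, probs = zip(*weights.items())
    return rng.choices(names, weights=probs, k=1)[0]
```

For example, if the model trails its single-task score by 10 F1 on SQuAD and 20 F1 on DROP, DROP instances would be sampled twice as often.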


Single-dataset Experts for Multi-dataset Question Answering

This work trains a collection of lightweight, dataset-specific adapter modules that share an underlying Transformer model, and finds that these Multi-Adapter Dataset Experts (MADE) outperform all the authors' baselines in terms of in-distribution accuracy, and simple methods based on parameter-averaging lead to better zero-shot generalization and few-shot transfer performance.

All Birds with One Stone: Multi-task Text Classification for Efficient Inference with One Forward Pass

This work proposes a scalable method that achieves stronger performance with close to O(1) computation cost via only one forward pass for multi-task transformer models at serving time.

Successive Prompting for Decomposing Complex Questions

A way to generate a synthetic dataset that can be used to bootstrap a model's ability to decompose and answer intermediate questions is introduced, achieving an improvement in F1 of ~5% over a state-of-the-art model with synthetic augmentations on a few-shot version of the DROP dataset.

Multi-Task Learning in Natural Language Processing: An Overview

An overview of the use of MTL in NLP tasks is given and optimization techniques on loss construction, data sampling, and task scheduling to properly train a multi-task model are presented.

Data pseudo-labeling while adapting BERT for multitask approaches

This article addresses the lack of labels for different tasks by treating multi-task settings as a one-task multi-label setting, exploring eight different data pseudo-labeling approaches in the GLUE 4-task setting.

Tricks for Training Sparse Translation Models

This work finds that sparse architectures for multilingual machine translation can perform poorly out of the box and proposes two straightforward techniques to mitigate this: a temperature heating mechanism and dense pre-training.
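Temperature-based sampling, the mechanism that temperature heating schedules over training, is commonly formulated as sampling dataset i with probability proportional to n_i^(1/T). The sketch below illustrates that standard formulation under stated assumptions; the paper's exact schedule may differ, and the names are illustrative.

```python
def temperature_sampling_probs(sizes, temperature):
    """Dataset-size-proportional sampling smoothed by temperature T:
    p_i is proportional to n_i ** (1/T). T=1 recovers proportional
    sampling; larger T approaches uniform, up-weighting low-resource
    datasets. 'Heating' raises T gradually during training."""
    scaled = {name: n ** (1.0 / temperature) for name, n in sizes.items()}
    total = sum(scaled.values())
    return {name: s / total for name, s in scaled.items()}
```

With a 100:1 size ratio between a high- and low-resource pair, raising T from 1 to 5 noticeably increases the low-resource pair's sampling probability.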

Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion

A Modality-Specific Learning Rate (MSLR) method to effectively build late-fusion multimodal models from fine-tuned unimodal models is proposed; experiments show that MSLR outperforms global learning rates on multiple tasks and settings and enables the models to effectively learn each modality.

MUSER: MUltimodal Stress detection using Emotion Recognition as an Auxiliary Task

This work proposes MUSER – a transformer-based model architecture and a novel multi-task learning algorithm with speed-based dynamic sampling strategy that is effective for stress detection with both internal and external auxiliary tasks, and achieves state-of-the-art results.

Let the Model Decide its Curriculum for Multitask Learning

This work proposes two classes of techniques for arranging training instances into a learning curriculum based on difficulty scores computed via model-based approaches, and shows that both instance-level and dataset-level techniques yield strong representations, leading to an average performance improvement over their respective baselines.

Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills

This work proposes to leverage semi-structured tables, and automatically generate at scale question-paragraph pairs, where answering the question requires reasoning over multiple facts in the paragraph, and shows that the model, PReasM, substantially outperforms T5, a popular pre-trained encoder-decoder model.



Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

A novel sample re-weighting scheme assigns sample-specific weights to the loss of a joint Machine Reading Comprehension (MRC) model and can be applied to a wide range of MRC tasks in different domains.
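The core idea of sample re-weighting, multiplying each example's loss by an example-specific weight before aggregation, can be sketched generically. This is a plain weighted-average illustration under stated assumptions, not the paper's specific weighting scheme.

```python
def reweighted_loss(losses, weights):
    """Combine per-sample losses with sample-specific weights,
    normalizing by the total weight so the scale of the loss is
    comparable across batches with different weight distributions."""
    assert len(losses) == len(weights) and sum(weights) > 0
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

Up-weighting easy (low-loss) samples pulls the aggregate loss down relative to a uniform average, and vice versa; the scheme's value lies in how the weights themselves are chosen.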

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

A new reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs, and presents a new model that combines reading comprehension methods with simple numerical reasoning to achieve 51% F1.

ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension

An evaluation server, ORB, is presented, that reports performance on seven diverse reading comprehension datasets, encouraging and facilitating testing a single model's capability in understanding a wide variety of reading phenomena.

MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension

It is shown that training on a source RC dataset and transferring to a target dataset substantially improves performance, even in the presence of powerful contextual representations from BERT (Devlin et al., 2019).

MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension

In this task, 18 distinct question answering datasets were adapted and unified into the same format and the best system achieved an average F1 score of 72.5 on the 12 held-out datasets.

Online Multi-Task Learning Using Active Sampling

This work proposes a simple yet efficient multi-task learning framework which solves multiple goal-directed tasks in an online or active learning setup without the need for expert supervision.

A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

A hierarchical model trained in a multi-task learning setup on a set of carefully selected semantic tasks achieves state-of-the-art results on a number of tasks, namely Named Entity Recognition, Entity Mention Detection and Relation Extraction without hand-engineered features or external NLP tools like syntactic parsers.

The NarrativeQA Reading Comprehension Challenge

A new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts are presented, designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience.

Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning

This work presents a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference among entities in over 4.7K English paragraphs from Wikipedia, and shows that state-of-the-art reading comprehension models perform significantly worse than humans on this benchmark.

SQuAD: 100,000+ Questions for Machine Comprehension of Text

A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).