Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

Phyllis Ang, Bhuwan Dhingra, Lisa Wu Wills
With many real-world applications of Natural Language Processing (NLP) involving long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic study of this accuracy vs. efficiency trade-off on two widely used long-sequence models - Longformer…
1 Citation

A Systematic Review of Green AI

With the ever-growing adoption of AI-based systems, the carbon footprint of AI is no longer negligible. AI researchers and practitioners are therefore urged to hold themselves accountable for the



QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

This work defines a new query-based multi-domain meeting summarization task, where models have to select and summarize relevant spans of meetings in response to a query, and introduces QMSum, a new benchmark for this task.

Efficient Attentions for Long Document Summarization

This work proposes Hepos, a novel efficient encoder-decoder attention with head-wise positional strides that effectively pinpoints salient information from the source, making it possible to process ten times more tokens than existing models that use full attention.

Efficient Transformers: A Survey

This article characterizes a large and thoughtful selection of recent efficiency-flavored “X-former” models, providing an organized and comprehensive overview of existing work and models across multiple domains.

Know What You Don’t Know: Unanswerable Questions for SQuAD

SQuADRUn is a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.

Long Range Arena: A Benchmark for Efficient Transformers

A systematic and unified benchmark, LRA, specifically focused on evaluating model quality under long-context scenarios, paves the way towards better understanding this class of efficient Transformer models, facilitates more research in this direction, and presents new challenging tasks to tackle.

Big Bird: Transformers for Longer Sequences

It is shown that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model.

Longformer: The Long-Document Transformer

Following prior work on long-sequence transformers, Longformer is evaluated on character-level language modeling and achieves state-of-the-art results on text8 and enwik8. Longformer is also pretrained and finetuned on a variety of downstream tasks.

Green AI

Creating efficiency in AI research will decrease its carbon footprint and increase its inclusivity as deep learning study should not require the deepest pockets.

SummScreen: A Dataset for Abstractive Screenplay Summarization

  • 2021
