SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization

@inproceedings{Ravaut2022SummaRerankerAM,
  title={SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization},
  author={Mathieu Ravaut and Shafiq R. Joty and Nancy F. Chen},
  booktitle={ACL},
  year={2022}
}
Sequence-to-sequence neural networks have recently achieved great success in abstractive summarization, especially through fine-tuning large pre-trained language models on the downstream dataset. These models are typically decoded with beam search to generate a unique summary. However, the search space is very large, and due to exposure bias, such decoding is not optimal. In this paper, we show that it is possible to directly train a second-stage model performing re-ranking on a set of…
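
As a rough illustration of the two-stage idea in the abstract, the sketch below generates a pool of beam-search candidates with a Hugging Face seq2seq checkpoint and then picks the candidate preferred by a learned scorer. The checkpoint name and the `scorer` callable are placeholders for illustration, not the paper's exact setup; SummaReranker's actual scorer is a multi-task mixture-of-experts model trained on candidate summaries.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "facebook/bart-large-cnn"  # placeholder base summarizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_candidates(document: str, num_candidates: int = 8) -> list[str]:
    """Stage 1: decode a pool of candidate summaries with beam search."""
    inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            num_beams=num_candidates,
            num_return_sequences=num_candidates,
            max_length=128,
        )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

def rerank(document: str, candidates: list[str], scorer) -> str:
    """Stage 2: return the candidate the trained re-ranker scores highest.

    `scorer(document, candidate)` is a placeholder for a second-stage model
    that returns a higher value for better candidates.
    """
    scores = [scorer(document, c) for c in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]
```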

Joint Generator-Ranker Learning for Natural Language Generation

TLDR
The generate-then-rank framework is revisited and a joint generator-ranker (JGR) training algorithm for text generation tasks is proposed, which achieves new state-of-the-art performance on five public benchmarks covering three popular generation tasks: summarization, question generation, and response generation.

MVP: Multi-task Supervised Pre-training for Natural Language Generation

TLDR
This work proposes Multi-task superVised Pre-training (MVP) for natural language generation, and collects a labeled pre-training corpus from 45 datasets over seven generation tasks to pre-train the text generation model MVP.

References

SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization

TLDR
SimCLS can bridge the gap between the learning objective and evaluation metrics resulting from the currently dominant sequence-to-sequence learning framework by formulating text generation as a reference-free evaluation problem assisted by contrastive learning.
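
A minimal sketch of the re-ranking step SimCLS describes: each candidate is scored against the source document (not a reference summary) by cosine similarity of embeddings from a shared encoder. The encoder checkpoint and first-token pooling below are illustrative assumptions; in SimCLS the scorer is additionally trained with a ranking loss so that higher-ROUGE candidates receive higher scores.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # assumed encoder
encoder = AutoModel.from_pretrained("roberta-base")

def embed(text: str) -> torch.Tensor:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state
    return hidden[:, 0]  # first-token pooling; a simplification

def score(document: str, candidate: str) -> float:
    # Reference-free evaluation: the candidate is scored only against the source.
    return F.cosine_similarity(embed(document), embed(candidate)).item()

def pick_best(document: str, candidates: list[str]) -> str:
    return max(candidates, key=lambda c: score(document, c))
```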

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

TLDR
This work proposes pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective, PEGASUS, and demonstrates it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores.
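
A simplified sketch of the gap-sentence generation (GSG) objective behind PEGASUS: "important" sentences are masked in the input and become the generation target. Here importance is approximated by unigram overlap with the rest of the document, a crude stand-in for the ROUGE-based selection used in the paper.

```python
def make_gsg_example(sentences: list[str], num_gaps: int = 1) -> tuple[str, str]:
    """Build one (masked_document, target) pair for gap-sentence generation."""
    def importance(i: int) -> int:
        rest = set(" ".join(s for j, s in enumerate(sentences) if j != i).lower().split())
        return len(set(sentences[i].lower().split()) & rest)

    gap_ids = set(sorted(range(len(sentences)), key=importance, reverse=True)[:num_gaps])
    source = " ".join("[MASK1]" if i in gap_ids else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(gap_ids))
    return source, target

# Example: the most "central" sentence is masked and becomes the target.
src, tgt = make_gsg_example([
    "PEGASUS masks whole sentences.",
    "The model must regenerate the masked sentences.",
    "Pre-training in this way resembles summarization.",
])
```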

Get To The Point: Summarization with Pointer-Generator Networks

TLDR
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways: first, a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information while retaining the ability to produce novel words through the generator; second, coverage to keep track of what has been summarized, which discourages repetition.
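
The copy mechanism reduces to a single mixture step: a generation probability p_gen interpolates between the decoder's vocabulary distribution and the attention distribution scattered onto source-token ids. A schematic sketch, with tensor shapes assumed for illustration:

```python
import torch

def final_distribution(p_gen: torch.Tensor,
                       vocab_dist: torch.Tensor,
                       attn_dist: torch.Tensor,
                       src_token_ids: torch.Tensor,
                       extended_vocab_size: int) -> torch.Tensor:
    """Mix generation and copying in the pointer-generator style.

    p_gen:         (batch, 1)       probability of generating from the vocabulary
    vocab_dist:    (batch, vocab)   softmax over the fixed vocabulary
    attn_dist:     (batch, src_len) attention weights over source positions
    src_token_ids: (batch, src_len) ids of source tokens in the extended vocabulary
    """
    batch, vocab = vocab_dist.shape
    final = torch.zeros(batch, extended_vocab_size)
    final[:, :vocab] = p_gen * vocab_dist
    # Scatter-add the copy probabilities onto the ids of the source tokens.
    final.scatter_add_(1, src_token_ids, (1.0 - p_gen) * attn_dist)
    return final
```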

RankQA: Neural Question Answering with Answer Re-Ranking

TLDR
RankQA is proposed, which extends the conventional two-stage process in neural QA with a third stage that performs additional answer re-ranking; it represents a novel, powerful, and thus challenging baseline for future research in content-based QA.

Abstractive Summarization of Reddit Posts with Multi-level Memory Networks

TLDR
This work collects the Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit, and proposes a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of text from different levels of abstraction.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

TLDR
This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.

Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond

TLDR
This work proposes several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling keywords, capturing the hierarchy of sentence-to-word structure, and emitting words that are rare or unseen at training time.

Passage Re-ranking with BERT

TLDR
A simple re-implementation of BERT for query-based passage re-ranking achieves strong results on the TREC-CAR dataset and the top entry on the leaderboard of the MS MARCO passage retrieval task, outperforming the previous state of the art by 27% in MRR@10.
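
The re-ranker described here is a cross-encoder: the query and each candidate passage are concatenated into one input, and BERT predicts a relevance score used to reorder the retrieved passages. A minimal sketch using a sequence-classification head; the checkpoint name is a placeholder, not the paper's released model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def relevance(query: str, passage: str) -> float:
    # BERT sees "[CLS] query [SEP] passage [SEP]"; the probability of the
    # positive class serves as the re-ranking score.
    enc = tokenizer(query, passage, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

def rerank(query: str, passages: list[str]) -> list[str]:
    return sorted(passages, key=lambda p: relevance(query, p), reverse=True)
```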

Alleviating Exposure Bias via Contrastive Learning for Abstractive Text Summarization

TLDR
This work proposes to leverage contrastive learning to decrease the likelihood of low-quality summaries while increasing the likelihood of the gold summary, and experimentally demonstrates that the method effectively improves the performance of the state-of-the-art model on different datasets.
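
The training signal can be sketched as a margin ranking loss that pushes the model to assign higher (length-normalized) log-likelihood to the gold summary than to low-quality candidates. The margin value and the exact scoring function below are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def contrastive_margin_loss(gold_logprob: torch.Tensor,
                            negative_logprobs: torch.Tensor,
                            margin: float = 1.0) -> torch.Tensor:
    """gold_logprob:      scalar length-normalized log-likelihood of the gold summary.
    negative_logprobs: (num_negatives,) log-likelihoods of low-quality candidates.
    The loss reaches zero once the gold summary outscores every negative by `margin`."""
    return F.relu(margin - (gold_logprob - negative_logprobs)).mean()
```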

Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization

TLDR
A novel abstractive model is proposed which is conditioned on the article’s topics and based entirely on convolutional neural networks, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans.
...