Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining

Yicheng Zou, Bolin Zhu, Xingwu Hu, Tao Gui, Qi Zhang
With the rapid increase in the volume of dialogue data from daily life, there is a growing demand for dialogue summarization. Unfortunately, training a large summarization model is generally infeasible due to the scarcity of dialogues with annotated summaries. Most existing works on low-resource dialogue summarization directly pretrain models in other domains, e.g., the news domain, but they generally neglect the huge difference between dialogues and conventional articles. To bridge the…


ADPL: Adversarial Prompt-based Domain Adaptation for Dialogue Summarization with Knowledge Disentanglement

An efficient Adversarial Disentangled Prompt Learning (ADPL) model for domain adaptation in dialogue summarization is introduced, along with three kinds of prompts: a domain-invariant prompt, a domain-specific prompt, and a task-oriented prompt.

Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization

An efficient and generalizable Domain-Oriented Prefix-tuning model is proposed, which utilizes a domain-word-initialized prefix module to alleviate domain entanglement and adopts discrete prompts to guide the model to focus on key contents of dialogues and enhance generalization.

Abstractive Meeting Summarization: A Survey

A survey of the challenges, datasets and systems relevant to this task and a discussion of promising directions for future study on abstractive summarization for multi-party meetings are provided.

Heuristic-based Inter-training to Improve Few-shot Multi-perspective Dialog Summarization

This work studies the multi-perspective summarization of customer-care conversations between support agents and customers, finds that different heuristics are associated with summaries of different perspectives, and explores these heuristics to create weak-labeled data for intermediate training of the models before fine-tuning with scarce human-annotated summaries.

A Survey of Pretrained Language Models Based Text Generation

This survey presents the recent advances achieved in the topic of PLMs for text generation and introduces three key points of applying PLMs to text generation, the first being how to encode the input data as representations that preserve input semantics and can be fused into PLMs.

Pretrained Language Models for Text Generation: A Survey

This paper presents an overview of the major advances achieved in the topic of pretrained language models for text generation and discusses how to adapt existing PLMs to model different input data and satisfy special properties in the generated text.

DialSummEval: Revisiting Summarization Evaluation for Dialogues

In this paper, 18 categories of metrics are re-evaluated in terms of four dimensions: coherence, consistency, fluency, and relevance, and a unified human evaluation of various models is conducted for the first time.

GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation

Experimental and analytical results demonstrate that the proposed Group-Transformer model outperforms its Transformer counterparts by a consistent gain and can be successfully scaled up to 60 encoder layers and 36 decoder layers.


  • Lulu Zhao, Fujia Zheng, Weiran Xu
  • Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
  • 2022



A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining

A novel abstractive summary network that adapts to the meeting scenario is proposed with a hierarchical structure to accommodate long meeting transcripts and a role vector to depict the difference among speakers.

AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization

A study of domain adaptation for the abstractive summarization task across six diverse target domains in a low-resource setting and finds that continuing pre-training could lead to the pre-trained model's catastrophic forgetting, and a learning method with less forgetting can alleviate this issue.

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

This work proposes pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective, PEGASUS, and demonstrates it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores.
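The gap-sentence objective named above can be illustrated with a minimal sketch: whole sentences are masked out of a document and the model is trained to generate them. The importance scoring below (a simple length heuristic) is a hypothetical stand-in for PEGASUS's ROUGE-based sentence selection, and the function name is illustrative, not from the paper.

```python
def make_gap_sentence_example(sentences, gap_ratio=0.3, mask_token="<mask>"):
    """Build one (source, target) pretraining pair in the gap-sentence style.

    The selected "important" sentences are replaced by a mask token in the
    source and concatenated (in document order) to form the generation target.
    Stand-in importance score: longer sentences are assumed more informative;
    PEGASUS instead scores sentences by ROUGE against the rest of the document.
    """
    n_gaps = max(1, int(len(sentences) * gap_ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: len(sentences[i]), reverse=True)
    gap_ids = set(ranked[:n_gaps])
    source = " ".join(mask_token if i in gap_ids else s
                      for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(gap_ids))
    return source, target

sents = [
    "Dialogue data is growing rapidly.",
    "Training large summarizers needs annotated summaries, which are scarce.",
    "Pretraining on news articles ignores the gap between dialogues and articles.",
]
src, tgt = make_gap_sentence_example(sents)
```

Here the longest sentence is masked in `src` and becomes the target `tgt`, so the model must reconstruct document-salient content, which is what makes the objective a good proxy for abstractive summarization.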

Improving Abstractive Dialogue Summarization with Graph Structures and Topic Words

A Topic-word Guided Dialogue Graph Attention (TGDGA) network is proposed to model the dialogue as an interaction graph according to the topic word information, and a masked graph self-attention mechanism is used to integrate cross-sentence information flows and focus more on the related utterances, which helps the model better understand the dialogue.

Exploring Domain Shift in Extractive Text Summarization

This paper extends the conventional definition of the domain from categories to data sources for the text summarization task, then re-purposes a multi-domain summarization dataset and verifies how the gap between different domains influences the performance of neural summarization models.

Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks

A novel multi-speaker dialogue summarizer is presented to demonstrate how large-scale commonsense knowledge can facilitate dialogue understanding and summary generation, and a Dialogue Heterogeneous Graph Network (D-HGN) is designed to model both types of information.

Generative Adversarial Network with Policy Gradient for Text Summarization

Qualitative and quantitative experimental results show that the proposed generative adversarial network model can generate summaries that are more relevant, less repetitive, grammatically correct, and preferred by humans, and that it is promising for the abstractive text summarization task.

Text Summarization with Pretrained Encoders

This paper introduces a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences and proposes a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two.

Augmenting Neural Response Generation with Context-Aware Topical Attention

This work introduces a Topical Hierarchical Recurrent Encoder Decoder (THRED), a novel, fully data-driven, multi-turn response generation system intended to produce contextual and topic-aware responses.

Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting

An accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively to generate a concise overall summary is proposed, which achieves the new state-of-the-art on all metrics on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores.