Structural Pre-training for Dialogue Comprehension

@inproceedings{Zhang2021StructuralPF,
  title={Structural Pre-training for Dialogue Comprehension},
  author={Zhuosheng Zhang and Hai Zhao},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={2021}
}
Pre-trained language models (PrLMs) have demonstrated superior performance due to their strong ability to learn universal language representations from self-supervised pre-training. However, even with the help of the powerful PrLMs, it is still challenging to effectively capture task-related knowledge from dialogue texts which are enriched by correlations among speaker-aware utterances. In this work, we present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive… 
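
The truncated abstract above does not spell out SPIDER's exact pre-training objectives, so the following is only a minimal, illustrative sketch of one structural objective commonly used for dialogue pre-training, utterance order restoration: shuffle a dialogue's utterances and train the model to recover each utterance's original position. All function and module names are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of an utterance order restoration objective (illustrative only).
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_order_restoration_example(utterances, seed=None):
    """Shuffle a dialogue's utterances; gold[i] is the original index of the
    i-th shuffled utterance."""
    rng = random.Random(seed)
    order = list(range(len(utterances)))
    rng.shuffle(order)
    shuffled = [utterances[i] for i in order]
    return shuffled, torch.tensor(order)

class OrderRestorationHead(nn.Module):
    """Scores each encoded utterance against every possible original position."""
    def __init__(self, hidden_size, max_utterances):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, max_utterances)

    def forward(self, utterance_vectors, gold_positions):
        # utterance_vectors: (num_utterances, hidden_size), e.g. pooled encodings
        # of each shuffled utterance produced by any PrLM encoder.
        logits = self.classifier(utterance_vectors)
        loss = F.cross_entropy(logits, gold_positions)
        return loss, logits
```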

Citations

Semantic-based Pre-training for Dialogue Understanding

A semantic-based pre-training framework that extends the standard pre-training framework by three tasks for learning 1) core semantic units, 2) semantic relations and 3) the overall semantic representation according to AMR graphs is proposed.

Advances in Multi-turn Dialogue Comprehension: A Survey

The characteristics and challenges of dialogue comprehension in contrast to plain-text reading comprehension are summarized, and three typical patterns of dialogue modeling that are widely used in dialogue comprehension tasks such as response selection and conversation question answering are discussed.

Structural Characterization for Dialogue Disentanglement

This work specially takes structure factors into account and designs a novel model for dialogue disentanglement that achieves a new state-of-the-art on the Ubuntu IRC benchmark dataset and contributes to dialogue-related comprehension.

Unified Knowledge Prompt Pre-training for Customer Service Dialogues

All the tasks of customer service dialogues are formulated as a unified text-to-text generation task and a knowledge-driven prompt strategy to jointly learn from a mixture of distinct dialogue tasks is introduced.

Speaker-Aware Discourse Parsing on Multi-Party Dialogues

A speaker-aware model for discourse parsing on multi-party dialogues is proposed, using the interaction features between different speakers, together with a second-stage pre-training task, same speaker prediction (SSP), which enhances conversational context representations by predicting whether two utterances are from the same speaker.
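
To make the same speaker prediction (SSP) idea above concrete, here is a minimal sketch of how such pairs and a binary classification head could be built; the data layout and module names are illustrative assumptions, not the cited paper's code.

```python
# Hedged sketch of a same speaker prediction (SSP) objective.
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_ssp_pairs(dialogue):
    """dialogue: list of (speaker_id, utterance) tuples.
    Returns ((utterance_a, utterance_b), label) with label 1 if same speaker."""
    pairs = []
    for (s1, u1), (s2, u2) in itertools.combinations(dialogue, 2):
        pairs.append(((u1, u2), 1 if s1 == s2 else 0))
    return pairs

class SSPHead(nn.Module):
    """Binary classifier over a pair representation, e.g. the concatenation of
    two utterance encodings from any sentence encoder."""
    def __init__(self, hidden_size):
        super().__init__()
        self.classifier = nn.Linear(2 * hidden_size, 2)

    def forward(self, pair_vectors, labels):
        logits = self.classifier(pair_vectors)   # (batch, 2)
        loss = F.cross_entropy(logits, labels)
        return loss, logits
```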

Two-Level Supervised Contrastive Learning for Response Selection in Multi-Turn Dialogue

A new method for supervised contrastive learning is developed and employed in response selection in multi-turn dialogue, and results on three benchmark datasets suggest that the proposed method significantly outperforms the contrastive learning baseline and the state-of-the-art methods for the task.
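
For orientation, a minimal sketch of a standard supervised contrastive loss of the kind referred to above is given below; it does not reproduce the cited paper's two-level design, and the tensor shapes and defaults are assumptions.

```python
# Hedged sketch of a supervised contrastive loss (positives share a label).
import torch

def supervised_contrastive_loss(embeddings, labels, temperature=0.07):
    """embeddings: (N, d), assumed L2-normalized; labels: (N,) integer class ids."""
    sim = embeddings @ embeddings.t() / temperature             # (N, N) similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=embeddings.device)
    sim = sim.masked_fill(self_mask, float('-inf'))             # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)  # log-softmax per anchor
    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = positives.sum(dim=1)
    valid = pos_counts > 0                                      # skip anchors without positives
    per_anchor = -log_prob.masked_fill(~positives, 0.0).sum(dim=1)
    return (per_anchor[valid] / pos_counts[valid]).mean()
```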

Sparse and Dense Approaches for the Full-rank Retrieval of Responses for Dialogues

This paper investigates both dialogue context and response expansion techniques for sparse retrieval, as well as zero-shot and fine-tuned dense retrieval approaches, and finds the best performing method overall to be dense retrieval with intermediate training (a step after language model pre-training where sentence representations are learned) followed by fine-tuning on the target conversational data.

Task Compass: Scaling Multi-task Pre-training with Task Prefix

This work proposes a task prefix guided multi-task pre-training framework to explore the relationships among tasks, which can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.

Attend, Select and Eliminate: Accelerating Multi-turn Response Selection with Dual-attention-based Content Elimination

A post-training strategy is introduced to mitigate the training-inference gap posed by content elimination, which effectively speeds up SOTA models without much performance degradation and shows a better trade-off between speed and performance than previous methods.

Structure Inducing Pre-Training

Relative reduction of error (RRE) of models trained under the authors' framework, compared with published per-token or per-sample baselines, indicates that models trained under the framework reduce error more and thus outperform the baselines.
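
For reference, relative reduction of error is commonly computed as the fraction of the baseline's error that the new model eliminates; the helper below follows that common definition and is not necessarily the cited paper's exact formulation.

```python
def relative_reduction_of_error(baseline_error, model_error):
    """RRE = (baseline_error - model_error) / baseline_error.
    Positive values mean the model makes fewer errors than the baseline."""
    return (baseline_error - model_error) / baseline_error

# Example: a baseline at 20% error and a model at 15% error gives RRE = 0.25,
# i.e. a quarter of the baseline's errors are eliminated.
print(relative_reduction_of_error(0.20, 0.15))  # 0.25
```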

References

SHOWING 1-10 OF 63 REFERENCES

Fine-grained Post-training for Improving Retrieval-based Dialogue Systems

A new fine-grained post-training method is proposed that reflects the characteristics of the multi-turn dialogue and achieves new state-of-the-art results on three benchmark datasets, suggesting that this model is highly effective for the response selection task.

DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

Experiments show that this approach remarkably outperforms three baselines, including BART and DialoGPT, in terms of quantitative evaluation, and the human evaluation suggests that DialogBERT generates more coherent, informative, and human-like responses than the baselines by significant margins.

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

This work proposes a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge grounded dialogues, and conversational question answering, and introduces discrete latent variables to tackle the inherent one-to-many mapping problem in response generation.

Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation

A Dialogue-Adaptive Pre-training Objective (DAPO) is designed based on important qualities for assessing dialogues that are usually ignored by general LM pre-training objectives, and experimental results show that models with DAPO surpass those with general LM pre-training objectives and other strong baselines on downstream DrNLP tasks.

Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues

This paper proposes learning a context-response matching model with auxiliary self-supervised tasks designed for dialogue data based on pre-trained language models, and jointly trains the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
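
As a rough illustration of joint multi-task training of the kind described above, the sketch below combines a main response-selection loss with weighted auxiliary losses; the task names and weights are made up for illustration and do not reproduce the cited paper's auxiliary tasks.

```python
# Hedged sketch: weighted sum of a main loss and auxiliary self-supervised losses.
import torch

def multi_task_loss(main_loss, auxiliary_losses, weights=None):
    """main_loss: scalar tensor for response selection.
    auxiliary_losses: dict of task name -> scalar tensor.
    weights: optional dict of task name -> float (defaults to 1.0)."""
    weights = weights or {}
    total = main_loss
    for name, loss in auxiliary_losses.items():
        total = total + weights.get(name, 1.0) * loss
    return total

# Usage with made-up scalar losses:
loss = multi_task_loss(
    torch.tensor(0.9),
    {"utterance_order": torch.tensor(0.4), "masked_lm": torch.tensor(1.2)},
    weights={"masked_lm": 0.5},
)
```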

MuTual: A Dataset for Multi-Turn Dialogue Reasoning

MuTual is introduced, a novel dataset for Multi-Turn dialogue Reasoning, consisting of 8,860 manually annotated dialogues based on Chinese student English listening comprehension exams, which shows that there is ample room for improving reasoning ability.

DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

It is shown that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems.

DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension

DREAM, the first dialogue-based multiple-choice reading comprehension data set to focus on in-depth multi-turn multi-party dialogue understanding, is presented, and experimental results on it show the effectiveness of dialogue structure and general world knowledge.

Modeling Multi-turn Conversation with Deep Utterance Aggregation

This paper formulates previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation, and shows the model outperforms state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.

Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots

A new model, named Speaker-Aware BERT (SA-BERT), is proposed to make the model aware of speaker change information, which is an important and intrinsic property of multi-turn dialogues, and a speaker-aware disentanglement strategy is proposed to tackle entangled dialogues.
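
One simple way to expose speaker change information to a BERT-style encoder, in the spirit of the summary above, is to add a learned per-token speaker embedding on top of the usual input embeddings; the module below is an illustrative sketch with assumed names, not the paper's implementation.

```python
# Hedged sketch: injecting per-token speaker embeddings into the input.
import torch
import torch.nn as nn

class SpeakerAwareEmbeddings(nn.Module):
    """Adds a learned speaker embedding to whatever a base embedding layer
    (token + position + segment) already produces."""
    def __init__(self, base_embedding_layer, hidden_size, num_speakers=2):
        super().__init__()
        self.base = base_embedding_layer      # any module mapping token ids -> (B, T, H)
        self.speaker = nn.Embedding(num_speakers, hidden_size)

    def forward(self, input_ids, speaker_ids):
        # speaker_ids: (batch, seq_len), one speaker id per token, so the encoder
        # can tell where the speaker changes across utterances.
        return self.base(input_ids) + self.speaker(speaker_ids)

# Usage with a toy base embedding standing in for a real PrLM embedding layer:
vocab_size, hidden = 100, 16
emb = SpeakerAwareEmbeddings(nn.Embedding(vocab_size, hidden), hidden, num_speakers=2)
tokens = torch.randint(0, vocab_size, (1, 8))
speakers = torch.tensor([[0, 0, 0, 1, 1, 0, 0, 1]])
out = emb(tokens, speakers)                   # shape (1, 8, 16)
```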
...