MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems

Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer learning framework, which allows us to plug-and-play pre-trained seq2seq models, and jointly learn dialogue state tracking and dialogue response generation. Unlike previous approaches, which use a copy mechanism to "carryover" the old dialogue states to the new one, we… 

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

This study presents PPTOD, a unified plug-and-play model for task-oriented dialogue, and introduces a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora.

Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems

This paper proposes a method to embed the KB, of any size, directly into the model parameters, which does not require any DST or template responses, nor the KB as input, and it can dynamically update its KB via fine-tuning.

Pre-training and Data Augmentation for Dialogue State Tracking

The detailed experiments show that baseline models benefit immensely when pre-trained with span-level objectives in multiple phases and suggest that data augmentation techniques such as paraphrasing also improve DST performance.

Schema Encoding for Transferable Dialogue State Tracking

This paper proposes Schema Encoding for Transferable Dialogue State Tracking (SET-DST), which is a neural DST method for effective transfer to new domains by encoding new schemas and using them for DST on multi-domain settings.

Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking

It is hypothesized that dialogue summaries are essentially unstructured dialogue states; hence, it is proposed to reformulate dialogue state tracking as a dialogue summarization problem, and the method DS2 outperforms previous works on few-shot DST in MultiWOZ 2.0 and 2.1.

DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Although DSTEA conducts only pre-training without directly infusing additional knowledge to the DST model, it achieved better performance than the best-known benchmark models on MultiWOZ 2.0, 2.1, and 2.2.

A Comparative Study on Language Models for Task-Oriented Dialogue Systems

It is found that BART and T5 outperform GPT-based models in BLEU and F1 scores and achieve state-of-the-art performance in a ToD system.

DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning

Experiments show that, compared with models of the same size, DFM can achieve state-of-the-art or competitive performance on very rich cross-domain downstream dialogue tasks, and demonstrate that DFM largely extends the ability of unified dialogue pre-trained models.

Continual Learning in Task-Oriented Dialogue Systems

This paper proposes a first-ever continual learning benchmark for task-oriented dialogue systems, with 37 domains to be learned continuously in both modularized and end-to-end learning settings, together with a simple yet effective architectural method based on residual adapters.

Zero-Shot Dialogue State Tracking via Cross-Task Transfer

This work proposes TransferQA, a transferable generative QA model that seamlessly combines extractive QA and multi-choice QA via a text-to-text transformer framework, and tracks both categorical slots and non-categorical slots in DST.



Hello, It’s GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

This paper proposes a task-oriented dialogue model that operates solely on text input: it effectively bypasses explicit policy and language generation modules and holds promise to mitigate the data scarcity problem, and to support the construction of more engaging and more eloquent task-oriented conversational agents.

A Simple Language Model for Task-Oriented Dialogue

SimpleTOD is a simple approach to task-oriented dialogue that uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem, which allows it to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.

Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures

A novel, holistic, extendable framework based on a single sequence-to-sequence (seq2seq) model, which can be optimized with supervised or reinforcement learning, is proposed. It significantly outperforms state-of-the-art pipeline-based methods on large datasets and retains a satisfactory entity match rate on out-of-vocabulary (OOV) cases where pipeline-designed competitors totally fail.

TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue

The experimental results show that the pre-trained task-oriented dialogue BERT (TOD-BERT) surpasses BERT and other strong baselines in four downstream task-oriented dialogue applications, including intention detection, dialogue state tracking, dialogue act prediction, and response selection.

Teacher-Student Framework Enhanced Multi-domain Dialogue Generation

Experiments show that the dialogue system trained under the teacher-student framework outperforms the one that uses a belief tracker and also benefits from human-labeled semantic data.

TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents

A new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo is introduced. It combines a transfer-learning-based training scheme with a high-capacity Transformer model and shows strong improvements over the current state-of-the-art end-to-end conversational models.

Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation

This paper investigates how to approach DST using a generation framework without the pre-defined ontology list, where each turn of user utterance and system response is directly generated by applying a hierarchical encoder-decoder structure.

Efficient Dialogue State Tracking by Selectively Overwriting Memory

The accuracy gaps between the current and the ground truth-given situations are analyzed and it is suggested that it is a promising direction to improve state operation prediction to boost the DST performance.

Non-Autoregressive Dialog State Tracking

This paper proposes a novel framework of Non-Autoregressive Dialog State Tracking (NADST), which can factor in potential dependencies among domains and slots to optimize the models towards better prediction of dialogue states as a complete set rather than separate slots.

SOLOIST: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model

A new method SOLOIST is presented, which uses transfer learning to efficiently build task-oriented dialog systems at scale using a Transformer-based auto-regressive language model, which subsumes different dialog modules into a single neural model.