Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset

Hong Liu, Hao Peng, Zhijian Ou, Juan-Zi Li, Yi Huang, Junlan Feng
Recently, there has emerged a class of task-oriented dialogue (TOD) datasets collected through Wizard-of-Oz simulated games. However, Wizard-of-Oz data are in fact simulated and thus fundamentally different from real-life conversations, which are noisier and more casual. The recently organized SereTOD challenge released the MobileCS dataset, which consists of real-world dialog transcripts between real users and customer-service staff from China Mobile. Based on the MobileCS…


Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems

A knowledge-grounded dialog model is built that takes the dialog history and local KB as input and predicts the system response; it achieves first place in both the automatic evaluation and the human interaction evaluation.



MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling

The Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning multiple domains and topics, is introduced; at 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora.

Building a Conversational Agent Overnight with Dialogue Self-Play

A new corpus of 3,000 dialogues spanning 2 domains collected with M2M is proposed, and comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows are presented.

Dialogue-Based Relation Extraction

It is argued that speaker-related information plays a critical role in the proposed task, based on an analysis of similarities and differences between dialogue-based and traditional RE tasks, and a new metric to evaluate the performance of RE methods in a conversational setting is designed.

Frames: a corpus for adding memory to goal-oriented dialogue systems

The frame tracking task, which consists of keeping track of different semantic frames throughout each dialogue, is proposed together with a rule-based baseline, and the task is analysed through this baseline.

Semi-supervised Learning for Information Extraction from Dialogue

It is shown that the method presented improves the classification task in the case where only a small amount of labeled data is available, and several types of encoders are compared, both in the context of a classification task and in a human evaluation of their learned representations.

Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset

This work introduces the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots provided as input.

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

This study presents PPTOD, a unified plug-and-play model for task-oriented dialogue, and introduces a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora.

A Simple Language Model for Task-Oriented Dialogue

SimpleTOD is a simple approach to task-oriented dialogue that uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem, which allows it to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.

Flexibly-Structured Model for Task-Oriented Dialogues

This architecture is scalable to real-world scenarios and is shown through an empirical evaluation to achieve state-of-the-art performance on both the Cambridge Restaurant dataset and the Stanford in-car assistant dataset.

CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

The large size and rich annotation of CrossWOZ make it suitable to investigate a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, user simulation, etc.