Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems

@article{Chen2021ActionBasedCD,
  title={Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems},
  author={Derek Chen and Howard Chen and Yi Yang and Alex Tong Lin and Zhou Yu},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.00783}
}
Existing goal-oriented dialogue datasets focus mainly on identifying slots and values. However, customer support interactions in reality often involve agents following multi-step procedures derived from explicitly-defined company policies as well. To study customer service dialogue systems in more realistic settings, we introduce the Action-Based Conversations Dataset (ABCD), a fully-labeled dataset with over 10K human-to-human dialogues containing 55 distinct user intents requiring unique… 

DG2: Data Augmentation Through Document Grounded Dialogue Generation
TLDR
An automatic data augmentation technique grounded on documents through a generative dialogue model that consists of a user bot and agent bot that can synthesize diverse dialogues given an input document which is then used to train a downstream model.
Unsupervised Learning of Hierarchical Conversation Structure
TLDR
This work introduces an unsupervised approach to learning hierarchical conversation structure, including turn and sub-dialogue segment labels, corresponding roughly to dialogue acts and sub/sub-tasks, respectively, and is shown to be useful in enhancing neural models of language for three conversation-level understanding tasks.
Workflow Discovery from Dialogues in the Low Data Regime
TLDR
This work proposes and evaluates an approach that conditions models on the set of allowable action steps and shows that this strategy improves workflow discovery (WD) performance, as well as zero-shot and few-shot WD performance when transferring learned models to entirely new domains.
Long-term Control for Dialogue Generation: Methods and Evaluation
TLDR
This work identifies gaps in current methods for evaluation, proposes new metrics that better assess dialogue control relative to current alternatives, and proposes a retrieval-augmented method that improves performance of long-term controlled generation via logit modification techniques.
Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition
We propose novel AI-empowered chat bots for learning as conversation where a user does not read a passage but gains information and knowledge through conversation with a teacher bot. Our
BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling
TLDR
BiToD is introduced, the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling, and state-of-the-art baselines are provided under three evaluation settings (monolingual, bilingual, and cross-lingual).
GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection
TLDR
GALAXY is a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora via semi-supervised learning and has a stronger few-shot ability than existing models under various low-resource settings.
Where to Go for the Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals
TLDR
This paper proposes a mixed-type dialog model with a novel Prompt-based continual learning mechanism that enables the model to continually strengthen its ability on any specific type by utilizing existing dialog corpora effectively.
The Task2Dial Dataset: A Novel Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents
TLDR
The Task2Dial dataset is described, a novel dataset of document-grounded task-based dialogues, where an Information Giver provides instructions (by consulting a document) to an Information Follower, so that the latter can successfully complete the task.
Task2Dial: A Novel Task and Dataset for Commonsense-enhanced Task-based Dialogue Grounded in Documents
TLDR
The Task2Dial dataset is described, a novel dataset of document-grounded task-based dialogues, where an Information Giver provides instructions (by consulting a document) to an Information Follower, so that the latter can successfully complete the task.

References

Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
TLDR
This work introduces the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots provided as input.
A Simple Language Model for Task-Oriented Dialogue
TLDR
SimpleTOD is a simple approach to task-oriented dialogue that uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem, which allows it to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
TLDR
A Transferable Dialogue State Generator (TRADE) that generates dialogue states from utterances using a copy mechanism, facilitating transfer when predicting (domain, slot, value) triplets not encountered during training.
Building a Conversational Agent Overnight with Dialogue Self-Play
TLDR
A new corpus of 3,000 dialogues spanning 2 domains collected with M2M is proposed, and comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows are presented.
Training Neural Response Selection for Task-Oriented Dialogue Systems
TLDR
A novel method which pretrains the response selection model on large general-domain conversational corpora and fine-tunes the pretrained model for the target dialogue domain, relying only on the small in-domain dataset to capture the nuances of the given dialogue domain is proposed.
Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data
TLDR
The MultiDoGO dataset is introduced, which is over 8 times the size of MultiWOZ, the other largest comparable dialogue dataset currently available to the public, and which was collected with a Wizard-of-Oz approach wherein a crowd-sourced worker is paired with a trained annotator.
Key-Value Retrieval Networks for Task-Oriented Dialogue
TLDR
This work proposes a new neural dialogue agent that is able to effectively sustain grounded, multi-domain discourse through a novel key-value retrieval mechanism and significantly outperforms a competitive rule-based system and other existing neural dialogue architectures on the provided domains according to both automatic and human evaluation metrics.
Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset
TLDR
This work introduces the initial release of the Taskmaster-1 dataset which includes 13,215 task-based dialogs comprising six domains and offers several baseline models including state of the art neural seq2seq architectures with benchmark performance as well as qualitative human evaluations.
Schema-Guided Dialogue State Tracking Task at DSTC8
TLDR
The goal of this task is to develop dialogue state tracking models suitable for large-scale virtual assistants, with a focus on data-efficient joint modeling across domains and zero-shot generalization to new APIs.
End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2
TLDR
This paper presents an end-to-end neural architecture for dialogue systems that addresses both challenges above and achieves the success rate, language understanding, and response appropriateness in the 8th dialogue systems technology challenge (DSTC8).