Corpus ID: 233004444

MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation

@article{Ye2021MultiWOZ2A,
  title={MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation},
  author={Fanghua Ye and Jarana Manotumruksa and Emine Yilmaz},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.00773}
}
The MultiWOZ 2.0 dataset was released in 2018. It consists of more than 10,000 task-oriented dialogues spanning 7 domains, and has greatly stimulated the research of task-oriented dialogue systems. However, there is substantial noise in the state annotations, which hinders a proper evaluation of dialogue state tracking models. To tackle this issue, massive efforts have been devoted to correcting the annotations, resulting in three improved versions of…


ASSIST: Towards Label Noise-Robust Dialogue State Tracking
TLDR
This paper proposes a general framework, named ASSIST (lAbel noiSe-robuSt dIalogue State Tracking), to train DST models robustly from noisy labels, and shows the validity of ASSIST theoretically.
In-Context Learning for Few-Shot Dialogue State Tracking
TLDR
This work proposes an in-context (IC) learning framework for zero-shot and few-shot learning dialogue state tracking (DST), where a large pretrained language model (LM) takes a test instance and a few exemplars as input, and directly decodes the dialogue state without any parameter updates.
What Did You Say? Task-Oriented Dialog Datasets Are Not Conversational!?
TLDR
This work outlines a taxonomy of conversational and contextual effects, which is used to examine MULTIWOZ, SGD and SMCALFLOW, among the most recent and widely used task-oriented dialog datasets, and outlines desiderata for truly conversational dialog datasets.
Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking
TLDR
It is hypothesized that dialogue summaries are essentially unstructured dialogue states; hence, it is proposed to reformulate dialogue state tracking as a dialogue summarization problem, and the method DS2 outperforms previous works on few-shot DST in MultiWoZ 2.0 and 2.1.
Description-Driven Task-Oriented Dialog Modeling
TLDR
This paper proposes that schemata should be modified by replacing names or notations entirely with natural language descriptions, and shows that a language description-driven system exhibits better understanding of task specifications, higher performance on state tracking, improved data efficiency, and effective zero-shot transfer to unseen tasks.
A Chit-Chats Enhanced Task-Oriented Dialogue Corpora for Fuse-Motive Conversation Systems
TLDR
This work releases a multi-turn dialogue dataset called Chinese Chat-Enhanced-Task (CCET) and proposes a fuse-motive dialogue formalization approach, along with several evaluation metrics for TOD sessions that are integrated with CC utterances.
ViWOZ: A Multi-Domain Task-Oriented Dialogue Systems Dataset For Low-resource Language
TLDR
ViWOZ is the first multi-turn, multi-domain task-oriented dataset in Vietnamese, a low-resource language, and provides a comprehensive benchmark of both modular and end-to-end models in low-resource language scenarios.
Robust Dialogue State Tracking with Weak Supervision and Sparse Data
TLDR
A training strategy to build extractive DST models without the need for fine-grained manual span labels, and a new model architecture with a unified encoder that supports value as well as slot independence by leveraging the attention mechanism.
A Few-Shot Semantic Parser for Wizard-of-Oz Dialogues with the Precise ThingTalk Representation
TLDR
A new dialogue representation and a sample-efficient methodology that can predict precise dialogue states in WOZ conversations are proposed, and the ThingTalk representation is extended to capture all information an agent needs to respond properly.
Database Search Results Disambiguation for Task-Oriented Dialog Systems
TLDR
Training on augmented dialog data improves the model’s ability to deal with ambiguous scenarios, without sacrificing performance on unmodified turns, and helps the model to improve performance on DSR-disambiguation even in the absence of in-domain data, suggesting that it can be learned as a universal dialog skill.

References

SHOWING 1-10 OF 34 REFERENCES
MultiWOZ 2.3: A multi-domain task-oriented dataset enhanced with annotation corrections and co-reference annotation
TLDR
This paper introduces MultiWOZ 2.3, which differentiates incorrect annotations in dialogue acts from dialogue states and identifies a lack of co-reference when publishing the updated dataset, to ensure consistency between dialogue acts and dialogue states.
MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines
TLDR
This work uses crowdsourced workers to re-annotate state and utterances based on the original utterances in the dataset, benchmarks a number of state-of-the-art dialogue state tracking models on the MultiWOZ 2.1 dataset, and shows the joint state tracking performance on the corrected state annotations.
MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
TLDR
This work identifies and fixes dialogue state annotation errors across 17.3% of the utterances on top of MultiWOZ 2.1, and redefines the ontology by disallowing vocabularies of slots with a large number of possible values to help avoid annotation errors.
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling
TLDR
The Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics is introduced, at a size of 10k dialogues, at least one order of magnitude larger than all previous annotated task-oriented corpora.
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling
TLDR
RiSAWOZ is a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations, which contains 11.2K human-to-human multi-turn semantically annotated dialogues, with more than 150K utterances spanning over 12 domains, which is larger than all previous annotated H2H conversational datasets.
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
TLDR
The large size and rich annotation of CrossWOZ make it suitable to investigate a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, user simulation, etc.
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
TLDR
A Transferable Dialogue State Generator (TRADE) that generates dialogue states from utterances using copy mechanism, facilitating transfer when predicting (domain, slot, value) triplets not encountered during training.
Slot Self-Attentive Dialogue State Tracking
TLDR
This paper proposes a slot self-attention mechanism that can learn the slot correlations automatically, and achieves state-of-the-art performance on both datasets, verifying the necessity and effectiveness of taking slot correlations into consideration.
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
TLDR
This work introduces the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots provided as input.
Slot Attention with Value Normalization for Multi-domain Dialogue State Tracking
TLDR
A new architecture that exploits the ontology, consisting of Slot Attention (SA) and Value Normalization (VN) and referred to as SAVN, is proposed. It achieves state-of-the-art joint accuracy, and evaluation results show that VN can still contribute to the model even if only 30% of the ontology is used.