• Corpus ID: 231718890

An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker

Yen-Ting Lin and Yun-Nung Chen

Data-driven task-oriented dialogue systems have developed rapidly thanks to large-scale datasets. However, progress on dialogue systems in low-resource languages lags far behind because high-quality data is lacking. To advance cross-lingual technology for building dialogue systems, DSTC9 introduces the task of cross-lingual dialogue state tracking, in which the DST module is tested in a low-resource language given a training dataset in a resource-rich language. This paper studies the…


Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking

This work enhances the transfer learning process by intermediate fine-tuning of pretrained multilingual models, where the multilingual models are fine-tuned with different but related data and/or tasks.



Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog

This paper presents a new data set of 57k annotated utterances in English, Spanish, and Thai and uses it to evaluate three cross-lingual transfer methods: translating the training data, using cross-lingual pre-trained embeddings, and using a multilingual machine translation encoder as contextual word representations. Given several hundred training examples in the target language, the latter two methods outperform translating the training data.

Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems

Attention-Informed Mixed-Language Training (MLT) is introduced, a novel zero-shot adaptation method for cross-lingual task-oriented dialogue systems that leverages very few task-related parallel word pairs to generate code-switching sentences for learning the inter-lingual semantics across languages.
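
As a rough illustration of the code-switching idea in the summary above, the following minimal sketch replaces source-language words with their counterparts from a small parallel word-pair dictionary. It is not the authors' implementation: the function name `code_switch`, the replacement probability `p`, and the English-to-Spanish pairs are all hypothetical, and the attention-based choice of which words to pair is not modeled here.

```python
import random


def code_switch(tokens, word_pairs, p=1.0, rng=None):
    """Return a copy of `tokens` in which words found in `word_pairs`
    are swapped for their target-language counterpart with probability `p`.

    `word_pairs` is a hypothetical source->target dictionary; in MLT the
    pairs are selected with the help of attention, which this sketch skips.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    mixed = []
    for token in tokens:
        target = word_pairs.get(token.lower())
        if target is not None and rng.random() < p:
            mixed.append(target)  # substitute the target-language word
        else:
            mixed.append(token)   # keep the source-language word
    return mixed


# Hypothetical English -> Spanish task-related word pairs.
pairs = {"cheap": "barato", "restaurant": "restaurante"}
sentence = "i want a cheap restaurant".split()
mixed = code_switch(sentence, pairs)
print(" ".join(mixed))
```

With `p=1.0` every word that has a pair is replaced, so the mixed sentence keeps the English function words and swaps only the task-related ones.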

Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems

A Transferable Dialogue State Generator (TRADE) is proposed that generates dialogue states from utterances using a copy mechanism, facilitating transfer when predicting (domain, slot, value) triplets not encountered during training.

MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines

This work uses crowdsourced workers to re-annotate states and utterances based on the original utterances in the dataset, benchmarks a number of state-of-the-art dialogue state tracking models on the MultiWOZ 2.1 dataset, and shows the joint state tracking performance on the corrected state annotations.

Neural Belief Tracker: Data-Driven Dialogue State Tracking

This work proposes a novel Neural Belief Tracking (NBT) framework which overcomes past limitations, matching the performance of state-of-the-art models which rely on hand-crafted semantic lexicons and outperforming them when such lexicons are not provided.

A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking

This paper proposes to enhance DST by employing a contextual hierarchical attention network that not only discerns relevant information at both the word level and the turn level but also learns contextual representations, and proposes an adaptive objective that alleviates the slot imbalance problem by dynamically adjusting the weights of different slots during training.

Efficient Dialogue State Tracking by Selectively Overwriting Memory

The accuracy gaps between the current and the ground-truth-given situations are analyzed, suggesting that improving state operation prediction is a promising direction for boosting DST performance.
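
To make "state operation prediction" more concrete, here is a minimal sketch of how predicted per-slot operations could be applied to a dialogue-state memory so that only selected slots are overwritten. The four operation names (CARRYOVER, DELETE, DONTCARE, UPDATE) are not listed in the summary above and are assumed here from how this line of work is usually described; the function `apply_operations`, the slot names, and the values are hypothetical, and the model that predicts the operations and generates new values is not shown.

```python
def apply_operations(state, operations, new_values):
    """Apply one predicted operation per slot to the previous state.

    state:       dict of slot -> value from the previous turn
    operations:  dict of slot -> operation name (assumed set of four);
                 slots without an entry default to CARRYOVER
    new_values:  dict of slot -> freshly generated value, used for UPDATE
    """
    updated = {}
    for slot, value in state.items():
        op = operations.get(slot, "CARRYOVER")
        if op == "CARRYOVER":
            updated[slot] = value             # keep the previous value
        elif op == "DELETE":
            updated[slot] = None              # clear the slot
        elif op == "DONTCARE":
            updated[slot] = "dontcare"        # user has no preference
        elif op == "UPDATE":
            updated[slot] = new_values[slot]  # overwrite with a new value
        else:
            raise ValueError(f"unknown operation: {op}")
    return updated


# Hypothetical previous state and predictions for the current turn.
previous = {"hotel-area": "north", "hotel-stars": "4", "taxi-destination": "airport"}
ops = {"hotel-stars": "UPDATE", "taxi-destination": "DELETE"}
result = apply_operations(previous, ops, {"hotel-stars": "5"})
print(result)
```

Only the slot marked UPDATE needs a generated value; every other slot is resolved from the operation alone, which is why the quality of the operation prediction matters so much for overall accuracy.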

Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing

A novel approach is introduced that fully utilizes semantic similarity between dialogue utterances and the ontology terms, allowing the information to be shared across domains, and demonstrates great capability in handling multi-domain dialogues.

Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

The evaluation shows that the Attract-Repel method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones.

A Neural Conversational Model

A simple approach to conversational modeling is presented that uses the recently proposed sequence-to-sequence framework and is able to extract knowledge both from a domain-specific dataset and from a large, noisy, general-domain dataset of movie subtitles.