CAiRE in DialDoc21: Data Augmentation for Information Seeking Dialogue System

Etsuko Ishii, Yan Xu, Genta Indra Winata, Zhaojiang Lin, Andrea Madotto, Zihan Liu, Peng Xu, Pascale Fung

Information-seeking dialogue systems, including knowledge identification and response generation, aim to respond to users with fluent, coherent, and informative responses based on users’ needs. To tackle this challenge, we utilize data augmentation methods and several training techniques with pre-trained language models to learn a general pattern of the task and thus achieve promising performance. In the DialDoc21 competition, our system achieved a 74.95 F1 score and 60.74 Exact Match…
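
The abstract reports F1 and Exact Match but does not say how they are computed. As a rough illustration only, the sketch below assumes the common SQuAD-style definitions for span answers: Exact Match compares normalized strings, and F1 measures token overlap between the predicted and reference spans. The function names `normalize`, `exact_match`, and `token_f1` are hypothetical, and the shared task's official scorer may differ in its normalization details.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; not confirmed by the source.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that shares three of four tokens with a four-token reference scores 0.75 F1 and 0 Exact Match, which is why F1 is typically the higher of the two reported numbers.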

Docalog: Multi-document Dialogue System using Transformer-based Span Retrieval

Docalog, the proposed approach for the DialDoc-22 (MultiDoc2Dial) shared task, is a three-stage pipeline consisting of a document retriever model, an answer span prediction model, and an ultimate span picker that selects the most likely answer span from all predicted spans.

UniGDD: A Unified Generative Framework for Goal-Oriented Document-Grounded Dialogue

A prompt-connected multi-task learning strategy is developed to model the characteristics and connections of different tasks, and linear temperature scheduling is introduced to reduce the negative effect of irrelevant document information.

DialDoc 2021 Shared Task: Goal-Oriented Document-grounded Dialogue Modeling

The primary goal of this Shared Task is to build goal-oriented information-seeking conversation systems that can identify the most relevant knowledge in the associated document for generating agent responses in natural language.

Integrating Question Rewrites in Conversational Question Answering: A Reinforcement Learning Approach

A reinforcement learning approach that integrates QR and CQA tasks without corresponding labeled QR datasets is proposed and the experimental results show that this approach can bring improvement over the pipeline approaches.

Can Question Rewriting Help Conversational Question Answering?

A reinforcement learning approach is investigated that integrates QR and CQA tasks and does not require corresponding QR datasets for the targeted CQA; the RL method is found to be on par with the end-to-end baseline.



Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters

This paper proposes KnowExpert, an end-to-end framework to bypass the explicit retrieval process and inject knowledge into the pre-trained language models with lightweight adapters and adapt to the knowledge-grounded dialogue task.

Doc2Dial: A Framework for Dialogue Composition Grounded in Documents

We introduce Doc2Dial, an end-to-end framework for generating conversational data grounded in given documents. It takes the documents as input and generates the pipelined tasks for obtaining the

Wizard of Wikipedia: Knowledge-Powered Conversational Agents

The best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while a new benchmark allows for measuring further improvements in this important research direction.

Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems

This paper proposes a method to embed the KB, of any size, directly into the model parameters, which does not require any DST or template responses, nor the KB as input, and it can dynamically update its KB via fine-tuning.

DoQA - Accessing Domain-Specific FAQs via Conversational QA

The results of an existing strong system show that, thanks to transfer learning from a Wikipedia QA dataset and fine-tuning on a single FAQ domain, it is possible to build high-quality conversational QA systems for FAQs without in-domain training data.

Zero-shot Cross-lingual Dialogue Systems with Transferable Latent Variables

A zero-shot adaptation of task-oriented dialogue systems to low-resource languages is proposed to cope with the variance of similar sentences across different languages, which is induced by imperfect cross-lingual alignments and inherent differences in languages.

Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue

The proposed sequential latent variable model can keep track of the prior and posterior distribution over knowledge, and can not only reduce the ambiguity caused by the diversity of knowledge selection in conversation but also better leverage the response information for proper choice of knowledge.

CoQA: A Conversational Question Answering Challenge

CoQA, a novel dataset for building Conversational Question Answering systems, is introduced, and it is shown that conversational questions have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning).

QuAC: Question Answering in Context

QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as shown in a detailed qualitative evaluation.

Technical report on Conversational Question Answering

This project proposes a new system, RoBERTa + AT + KD, which involves rationale tagging multi-task learning, adversarial training, knowledge distillation, and a linguistic post-processing strategy; the single model achieves 90.4 (F1) on the CoQA test set without data augmentation.