Corpus ID: 231801989

Converse, Focus and Guess - Towards Multi-Document Driven Dialogue

@inproceedings{Liu2021ConverseFA,
  title={Converse, Focus and Guess - Towards Multi-Document Driven Dialogue},
  author={Han Liu and Caixia Yuan and Xiaojie Wang and Yushu Yang and Huixing Jiang and Zhongyuan Wang},
  booktitle={AAAI},
  year={2021}
}
We propose a novel task, Multi-Document Driven Dialogue (MD3), in which an agent can guess the target document that the user is interested in by leading a dialogue. To benchmark progress, we introduce a new dataset, GuessMovie, which contains 16,881 documents, each describing a movie, and 13,434 associated dialogues. Further, we propose the MD3 model. Keeping the goal of guessing the target document in mind, it converses with the user conditioned on both document engagement and user feedback. In order to…

References

SHOWING 1-10 OF 26 REFERENCES
End-to-End Reinforcement Learning of Dialogue Agents for Information Access
This paper proposes KB-InfoBot, a multi-turn dialogue agent which helps users search Knowledge Bases (KBs) without composing complicated queries. Such goal-oriented dialogue agents typically need…
A User Simulator for Task-Completion Dialogues
TLDR
A new, publicly available simulation framework, where the simulator, designed for the movie-booking domain, leverages both rules and collected data, and several agents are demonstrated and the procedure to add and test your own agent is detailed.
Global-to-local Memory Pointer Networks for Task-Oriented Dialogue
TLDR
The proposed global-to-local memory pointer networks can improve copy accuracy and mitigate the common out-of-vocabulary problem, and improve over the previous state-of-the-art models on both the simulated bAbI Dialogue dataset and the human-human Stanford Multi-domain Dialogue dataset in automatic and human evaluation.
Wizard of Wikipedia: Knowledge-Powered Conversational Agents
TLDR
The best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while a new benchmark allows for measuring further improvements in this important research direction.
Guessing State Tracking for Visual Dialogue
TLDR
A guessing state tracking based guess model for the Guesser, which significantly outperforms previous models and achieves a new state of the art; in particular, its guessing success rate of 83.3% approaches the human-level accuracy of 84.4%.
Visual Dialogue State Tracking for Question Generation
TLDR
This paper proposes a visual dialogue state tracking (VDST) based method for question generation that significantly outperforms existing methods and achieves new state-of-the-art performance on the GuessWhat?! dataset.
Incremental Transformer with Deliberation Decoder for Document Grounded Conversations
TLDR
This paper designs an Incremental Transformer to encode multi-turn utterances along with knowledge in related documents and designs a two-pass decoder (Deliberation Decoder) to improve context coherence and knowledge correctness.
CoQA: A Conversational Question Answering Challenge
TLDR
CoQA, a novel dataset for building Conversational Question Answering systems, is introduced, and it is shown that conversational questions have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning).
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
TLDR
This work poses a cooperative ‘image guessing’ game between two agents who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images, and shows the emergence of grounded language and communication among ‘visual’ dialog agents with no human supervision.
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue
We introduce GuessWhat?!, a two-player guessing game as a testbed for research on the interplay of computer vision and dialogue systems. The goal of the game is to locate an unknown object in a rich…