Corpus ID: 221094992

Continue or SHIFT: Learning Conversational Patterns for Dialogue Generation

Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun
In dialogues, it is often the case that the response could be either relevant or irrelevant to the given conversation context, depending on whether the speaker intends topic continuity or a topic shift. However, this aspect of dialogue is underexplored in existing generative dialogue systems, because the widely used encoder-decoder attention models are built on the assumption that the target sequence is and must be relevant to the source sequence. In this work, we propose the loose… 

ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation

Experimental results on both a Chinese customer service dataset and an English Ubuntu dialogue dataset show that ReCoSa significantly outperforms baseline models in terms of both metric-based and human evaluations.

Deep Reinforcement Learning for Dialogue Generation

This work simulates dialogues between two virtual agents, using policy gradient methods to reward sequences that display three useful conversational properties: informativity (non-repetitive turns), coherence, and ease of answering.
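The total reward described above combines the three conversational properties. A minimal sketch of such a weighted combination follows; the function name, weights, and example scores are illustrative assumptions, not taken from the paper:

```python
def combined_reward(ease_of_answering, informativity, coherence,
                    weights=(0.25, 0.25, 0.5)):
    """Weighted sum of three per-turn conversational rewards.

    Each component reward is assumed to be precomputed in [0, 1];
    the weights are illustrative placeholders.
    """
    w1, w2, w3 = weights
    return w1 * ease_of_answering + w2 * informativity + w3 * coherence

# A coherent, informative, easy-to-answer turn scores higher than
# a vague, repetitive one.
good_turn = combined_reward(0.9, 0.8, 0.9)
poor_turn = combined_reward(0.2, 0.1, 0.3)
assert good_turn > poor_turn
```

In the actual system, each component would be computed from model likelihoods over simulated dialogue turns rather than supplied directly.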

Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation

Three types of coherence models are proposed in this paper: an unlearned similarity function, a pretrained semantic matching function, and an end-to-end dual learning architecture. Experiments show that the proposed models produce more specific and meaningful responses, outperforming Seq2Seq models in both metric-based and human evaluations.

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

The recently proposed hierarchical recurrent encoder-decoder neural network is extended to the dialogue domain, and it is demonstrated that this model is competitive with state-of-the-art neural language models and back-off n-gram models.

Incorporating loose-structured knowledge into conversation modeling via recall-gate LSTM

A deep neural network is proposed that incorporates loose-structured knowledge as background knowledge for conversation modeling, through a recall mechanism with a specially designed recall-gate, enriching the ability of LSTMs to capture implicit semantic clues in conversations.

A Diversity-Promoting Objective Function for Neural Conversation Models

This work proposes using Maximum Mutual Information (MMI) as the objective function in neural models, and demonstrates that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

A neural-network-based generative architecture with stochastic latent variables spanning a variable number of time steps is proposed; it improves upon recently proposed models, and the latent variables facilitate both the generation of meaningful, long, and diverse responses and the maintenance of dialogue state.

Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders

This work presents a novel framework based on conditional variational autoencoders that captures discourse-level diversity in the encoder, using latent variables to learn a distribution over potential conversational intents and generating diverse responses with only greedy decoders.

Personalizing Dialogue Agents: I have a dog, do you have pets too?

This work collects data and trains models to condition on their given profile information and on information about the person they are talking to, resulting in improved dialogues, as measured by next-utterance prediction.

Hierarchical Recurrent Attention Network for Response Generation

This work proposes a hierarchical recurrent attention network (HRAN) that models both the hierarchy and the variance in importance within a unified framework, attending to important parts within and among utterances with word-level and utterance-level attention, respectively.
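The two-level attention described above can be sketched with plain dot-product attention: word-level attention summarizes each utterance into a vector, then utterance-level attention summarizes those vectors into a context vector. This is a toy illustration under assumed 2-d embeddings, not the paper's recurrent architecture:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(vectors, query):
    """Dot-product attention: weight each vector by similarity to the query
    and return the weighted sum."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]

# Word-level attention inside each utterance, then utterance-level
# attention over the resulting utterance vectors (toy 2-d embeddings).
query = [1.0, 0.0]
utterances = [
    [[0.9, 0.1], [0.2, 0.8]],   # utterance 1: word vectors
    [[0.1, 0.9], [0.0, 1.0]],   # utterance 2: word vectors
]
utt_vectors = [attend(words, query) for words in utterances]
context = attend(utt_vectors, query)
```

In HRAN proper, the queries and scores come from learned recurrent states rather than a fixed query vector, but the within-utterance/across-utterance weighting pattern is the same.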