Corpus ID: 23694187

End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning

@article{Liu2017EndtoEndOO,
  title={End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning},
  author={Bing Liu and G{\"o}khan T{\"u}r and Dilek Z. Hakkani-T{\"u}r and Pararth Shah and Larry Heck},
  journal={ArXiv},
  year={2017},
  volume={abs/1711.10712}
}
In this paper, we present a neural network based task-oriented dialogue system that can be optimized end-to-end with deep reinforcement learning (RL). The system is able to track dialogue state, interface with knowledge bases, and incorporate query results into the agent's responses to successfully complete task-oriented dialogues. Dialogue policy learning is conducted with hybrid supervised and deep RL methods. We first train the dialogue agent in a supervised manner by learning directly from…
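The two-stage recipe in the abstract (supervised pretraining on dialogue corpora, then policy-gradient fine-tuning against task success) can be sketched in miniature. Everything below is an illustrative stand-in, not the paper's actual model: a tabular softmax policy, four toy expert (state, action) pairs, and a hypothetical task-success reward.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 4, 3

# Tabular softmax policy: one row of action logits per discrete state.
logits = np.zeros((N_STATES, N_ACTIONS))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Stage 1 (supervised): cross-entropy steps toward expert (state, action)
# pairs, standing in for learning directly from dialogue corpora.
expert = [(0, 1), (1, 2), (2, 0), (3, 1)]
for _ in range(200):
    for s, a in expert:
        p = softmax(logits[s])
        grad = -p
        grad[a] += 1.0               # gradient of log p(a|s) w.r.t. the logits
        logits[s] += 0.5 * grad

# Stage 2 (RL fine-tuning): REINFORCE against a toy task-success reward,
# standing in for dialogue-level reward from completed tasks.
def reward(s, a):
    return 1.0 if a == (s + 1) % N_ACTIONS else 0.0

for _ in range(500):
    s = int(rng.integers(N_STATES))
    p = softmax(logits[s])
    a = int(rng.choice(N_ACTIONS, p=p))  # sample an action from the policy
    grad = -p
    grad[a] += 1.0
    logits[s] += 0.1 * reward(s, a) * grad   # policy-gradient update

greedy = logits.argmax(axis=1)               # resulting greedy policy
```

The point of the two stages is that supervised pretraining gives the RL phase a sensible starting policy, so policy-gradient exploration does not begin from random behavior.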

Figures and Tables from this paper

Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems
TLDR
Experimental results show that the end-to-end dialogue agent can learn effectively from the mistakes it makes via imitation learning from user teaching, and that applying reinforcement learning with user feedback after the imitation learning stage further improves the agent's capability to complete tasks successfully.
End-to-End Learning of Task-Oriented Dialogs
TLDR
This thesis proposal designs a neural network based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialogs.
User Modeling for Task Oriented Dialogues
TLDR
This work designs a hierarchical sequence-to-sequence model that first encodes the initial user goal and system turns into fixed-length representations using Recurrent Neural Networks (RNNs), and develops several variants that utilize a latent variable model to inject random variations and promote diversity in simulated user responses.
End-to-End latent-variable task-oriented dialogue system with exact log-likelihood optimization
TLDR
An end-to-end dialogue model based on a hierarchical encoder-decoder that employs a discrete latent variable to learn underlying dialogue intentions; the authors argue that this latent discrete variable interprets the intentions that guide machine response generation.
Transferable Dialogue Systems and User Simulators
TLDR
The goal is to develop a modelling framework that can incorporate new dialogue scenarios through self-play between two agents, and this framework proves highly effective in bootstrapping the performance of both agents in transfer learning.
Policy Adaptation for Deep Reinforcement Learning-Based Dialogue Management
TLDR
Simulation experiments showed that MADP can significantly speed up the policy learning and facilitate policy adaptation.
Integrating planning for task-completion dialogue policy learning
TLDR
This paper addresses the challenges of training a task-completion dialogue agent with real users via reinforcement learning by integrating planning into dialogue policy learning based on the Dyna-Q framework, providing a more sample-efficient approach to learning dialogue policies.
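Dyna-Q, which the summary above refers to, interleaves each real interaction with several planning updates replayed from a learned world model, which is what makes it sample-efficient with respect to real users. A minimal tabular sketch on a toy chain environment (the states, reward, and hyperparameters are illustrative assumptions, not that paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
N_S, N_A = 5, 2               # toy states/actions standing in for dialogue states/acts
Q = np.zeros((N_S, N_A))
model = {}                     # learned world model: (s, a) -> (r, s')

def step(s, a):
    # Toy deterministic environment: action 1 advances toward the goal
    # state N_S - 1, which pays reward 1; action 0 moves back.
    s2 = min(s + 1, N_S - 1) if a == 1 else max(s - 1, 0)
    return (1.0 if s2 == N_S - 1 else 0.0), s2

alpha, gamma, K = 0.5, 0.9, 10
s = 0
for _ in range(300):
    a = int(rng.integers(N_A))           # random behavior policy (Q-learning is off-policy)
    r, s2 = step(s, a)                   # one "real" interaction
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    model[(s, a)] = (r, s2)              # record the transition in the world model
    keys = list(model)
    for _ in range(K):                   # K planning steps on simulated experience
        ps, pa = keys[rng.integers(len(keys))]
        pr, ps2 = model[(ps, pa)]
        Q[ps, pa] += alpha * (pr + gamma * Q[ps2].max() - Q[ps, pa])
    s = 0 if s2 == N_S - 1 else s2       # reset after task success
```

With K planning steps per real step, the agent performs roughly K + 1 value updates per real interaction, which is the trade the cited paper exploits: cheap simulated experience in place of costly real-user turns.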
Bootstrapping a Neural Conversational Agent with Dialogue Self-Play, Crowdsourcing and On-Line Reinforcement Learning
TLDR
This paper discusses the advantages of this approach for industry applications of conversational agents, wherein an agent can be rapidly bootstrapped to deploy in front of users and further optimized via interactive learning from actual users of the system.
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems
TLDR
This paper proposes a method to embed the KB, of any size, directly into the model parameters, which does not require any DST or template responses, nor the KB as input, and it can dynamically update its KB via fine-tuning.
Cas-GANs: An Approach of Dialogue Policy Learning based on GAN and RL Techniques
TLDR
A new technique called Cascade Generative Adversarial Network (Cas-GAN), a combination of GAN and RL for dialogue generation, is proposed with the aim of improving the fluency and diversity of generated dialogues.

References

SHOWING 1-10 OF 25 REFERENCES
Iterative policy learning in end-to-end trainable task-oriented neural dialog models
  • Bing Liu, I. Lane
  • Computer Science
    2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
  • 2017
TLDR
A deep reinforcement learning (RL) framework for iterative dialog policy optimization in end-to-end task-oriented dialog systems, which jointly optimizes the dialog agent and the user simulator by simulating dialogs between the two agents.
End-to-End Task-Completion Neural Dialogue Systems
TLDR
The end-to-end system not only outperforms modularized dialogue system baselines in both objective and subjective evaluation, but is also robust to noise, as demonstrated by several systematic experiments with different error granularities and rates specific to the language understanding module.
End-to-End Reinforcement Learning of Dialogue Agents for Information Access
This paper proposes KB-InfoBot -- a multi-turn dialogue agent which helps users search Knowledge Bases (KBs) without composing complicated queries. Such goal-oriented dialogue agents typically need…
Deep Reinforcement Learning for Dialogue Generation
TLDR
This work simulates dialogues between two virtual agents, using policy gradient methods to reward sequences that display three useful conversational properties: informativity (non-repetitive turns), coherence, and ease of answering.
A Network-based End-to-End Trainable Task-oriented Dialogue System
TLDR
This work introduces a neural network-based text-in, text-out end-to-end trainable goal-oriented dialogue system, along with a new way of collecting dialogue data based on a novel pipelined Wizard-of-Oz framework, that can converse with human subjects naturally whilst helping them to accomplish tasks in a restaurant search domain.
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
TLDR
This model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2, a real-world task-oriented dialogue dataset.
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
TLDR
The recently proposed hierarchical recurrent encoder-decoder neural network is extended to the dialogue domain, and it is demonstrated that this model is competitive with state-of-the-art neural language models and back-off n-gram models.
An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog
TLDR
This work presents a novel end-to-end trainable neural network model that is able to track dialog state, issue API calls to knowledge base (KB), and incorporate structured KB query results into system responses to successfully complete task-oriented dialogs.
On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems
TLDR
An on-line learning framework whereby the dialogue policy is jointly trained alongside the reward model via active learning with a Gaussian process model is proposed.
Learning End-to-End Goal-Oriented Dialog
TLDR
It is shown that an end-to-end dialog system based on Memory Networks can reach promising, yet imperfect, performance and learn to perform non-trivial operations; it is compared to a hand-crafted slot-filling baseline on data from the second Dialog State Tracking Challenge.