Scaling up deep reinforcement learning for multi-domain dialogue systems

@inproceedings{Cuayhuitl2017ScalingUD,
  title={Scaling up deep reinforcement learning for multi-domain dialogue systems},
  author={Heriberto Cuay{\'a}huitl and Seunghak Yu and Ashley Williamson and Jacob Carse},
  booktitle={2017 International Joint Conference on Neural Networks (IJCNN)},
  year={2017},
  pages={3339--3346}
}
Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) for multiple tasks (domains) face scalability problems due to large search spaces. This paper proposes a three-stage method for multi-domain dialogue policy learning, termed NDQN, and applies it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. In this method, the first stage performs multi-policy learning via a network of DQN agents; the second makes use of compact state…
Transfer Learning based Task-oriented Dialogue Policy for Multiple Domains using Hierarchical Reinforcement Learning
TLDR
This paper presents a multi-domain, multi-intent task-oriented dialogue system that combines Hierarchical Deep Reinforcement Learning and Transfer Learning paradigms, and reduces the data requirement to train multi-domain VAs by at least 20% for distant domains and almost 38% for close domains.
Deep Reinforcement Learning of Dialogue Policies with Less Weight Updates
TLDR
A two-stage method for accelerating the induction of single or multi-domain dialogue policies through fewer weight updates in both stages, which is useful for training larger-scale neural-based spoken dialogue systems.
Feudal Reinforcement Learning for Dialogue Management in Large Domains
TLDR
A novel Dialogue Management architecture, based on Feudal RL, is proposed, which decomposes the decision into two steps: a first step where a master policy selects a subset of primitive actions, and a second step where a primitive action is chosen from the selected subset.
Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning
TLDR
A new method for hierarchical reinforcement learning using the option framework is proposed and it is shown that the proposed architecture learns faster and arrives at a better policy than the existing flat ones do.
Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning
TLDR
This paper addresses the travel planning task by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales.
A hierarchical approach for efficient multi-intent dialogue policy learning
TLDR
The proposed hierarchical method for learning an efficient Dialogue Management (DM) strategy for task-oriented conversations serving multiple intents of a domain attains an improvement of 41% in terms of dialogue length compared to a single-intent based system serving the same 5 intents.
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
TLDR
This paper addresses the travel planning task by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales.
Ensemble-Based Deep Reinforcement Learning for Chatbots
TLDR
A novel ensemble-based approach applied to value-based DRL chatbots, which use finite action sets as a form of meaning representation; the results show that near human-like dialogue policies can be induced and that generalisation to unseen data is a difficult problem.
Deep Reinforcement Learning for Chatbots Using Clustered Actions and Human-Likeness Rewards
TLDR
This work trains Deep Reinforcement Learning (DRL) agents using chitchat data in raw text, without any manual annotations, and proposes a simple but promising reward function based on human-likeness scores derived from human-human dialogue data.
Towards Scalable Information-Seeking Multi-Domain Dialogue
TLDR
This work trains a sub-domain identifier neural network that learns which features are relevant to the current turn and the immediate future, thus filtering out irrelevant information from the ontology and consequently the belief space at each dialogue turn.

References

Policy Networks with Two-Stage Training for Dialogue Systems
TLDR
This paper shows that, on summary state and action spaces, deep Reinforcement Learning (RL) outperforms Gaussian Processes methods and shows that a deep RL method based on an actor-critic architecture can exploit a small amount of data very efficiently.
Multi-domain dialogue success classifiers for policy training
TLDR
This work proposes a method for constructing dialogue success classifiers that are capable of making accurate predictions in domains unseen during training and demonstrates that these initial policy training results obtained with a simulated user carry over to learning from paid human users.
Dialogue Management based on Multi-domain Corpus
TLDR
This paper divides the Dialogue Act (DA), as a semantic representation of an utterance, into DA type and slot parameter, where the former is domain-independent and the latter is domain-specific, and generates the Multi-domain Corpus based Dialogue Management (MCDM) scheme.
Policy Learning for Domain Selection in an Extensible Multi-domain Spoken Dialogue System
TLDR
The experimental results suggest that the proposed model marginally outperforms a non-trivial baseline, and it is shown that by using a model parameter tying trick, the extensibility of the system can be preserved: dialogue components in new domains can be easily plugged in without re-training the domain selection policy.
Policy committee for adaptation in multi-domain spoken dialogue systems
TLDR
Inspired by Bayesian committee machines, this paper proposes the use of a committee of dialogue policies, and shows that such a model is particularly beneficial for adaptation in multi-domain dialogue systems.
Evaluation of a hierarchical reinforcement learning spoken dialogue system
TLDR
Experimental results in the travel planning domain provide evidence to support the following claims: (a) hierarchical semi-learnt dialogue agents are a better alternative (with higher overall performance) than deterministic or fully-learnt behaviour; (b) spoken dialogue strategies learnt with highly coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy.
Strategic Dialogue Management via Deep Reinforcement Learning
TLDR
A successful application of Deep Reinforcement Learning with a high-dimensional state space to the strategic board game of Settlers of Catan is described, which supports the claim that DRL is a promising framework for training dialogue systems and strategic agents with negotiation abilities.
Towards End-to-End Learning for Dialog State Tracking and Management using Deep Reinforcement Learning
This paper presents an end-to-end framework for task-oriented dialog systems using a variant of Deep Recurrent Q-Networks (DRQN). The model is able to interface with a relational database and jointly…
Hierarchical reinforcement learning for situated natural language generation
TLDR
A novel approach for situated Natural Language Generation in dialogue that is based on hierarchical reinforcement learning and learns the best utterance for a context by optimisation through trial and error is presented.
Nonstrict Hierarchical Reinforcement Learning for Interactive Systems and Robots
TLDR
A novel approach for dialogue policy optimization that combines the benefits of both hierarchical control and function approximation and that allows flexible transitions between dialogue subtasks to give human users more control over the dialogue is presented.