Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

@inproceedings{Su2015RewardSW,
  title={Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems},
  author={Pei-hao Su and David Vandyke and Milica Ga{\v{s}}i{\'c} and Nikola Mrk{\v{s}}i{\'c} and Tsung-Hsien Wen and Steve J. Young},
  booktitle={SIGDIAL Conference},
  year={2015}
}
Statistical spoken dialogue systems have the attractive property of being optimisable from data via interactions with real users. However, in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space in order to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real users, where learning costs are expensive. Reward shaping is one promising technique for addressing these…
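The abstract builds on potential-based reward shaping, where an auxiliary term F(s, s') = γΦ(s') − Φ(s) is added to the environment reward to speed up learning without changing the optimal policy. A minimal sketch is below; the hand-crafted potential function (slot-filling progress) is a hypothetical stand-in, whereas the paper learns the potential with a recurrent neural network.

```python
GAMMA = 0.99  # discount factor (assumed value, not from the paper)

def potential(state):
    # Hypothetical potential: fraction of dialogue slots already filled.
    filled, total = state
    return filled / total

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    # Potential-based shaping: add gamma * phi(s') - phi(s) to the
    # environment reward; this is known to preserve the optimal policy.
    return reward + gamma * potential(next_state) - potential(state)

# Example: a turn-level penalty of -1 while progressing from 1/3 to
# 2/3 slots filled yields a less negative (denser) learning signal.
r = shaped_reward(-1.0, (1, 3), (2, 3))
```

Because the shaping term telescopes over a trajectory, the agent receives intermediate feedback at every turn instead of waiting for the sparse end-of-dialogue success signal.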
