Uncertainty Measures in Neural Belief Tracking and the Effects on Dialogue Policy Performance

@inproceedings{Niekerk2021UncertaintyMI,
  title={Uncertainty Measures in Neural Belief Tracking and the Effects on Dialogue Policy Performance},
  author={Carel van Niekerk and Andrey Malinin and Christian Geishauser and Michael Heck and Hsien-chin Lin and Nurul Lubis and Shutong Feng and Milica Gašić},
  booktitle={EMNLP},
  year={2021}
}
The ability to identify and resolve uncertainty is crucial for the robustness of a dialogue system. Indeed, this has been confirmed empirically on systems that utilise Bayesian approaches to dialogue belief tracking. However, such systems consider only confidence estimates and have difficulty scaling to more complex settings. Neural dialogue systems, on the other hand, rarely take uncertainties into account. They are therefore overconfident in their decisions and less robust. Moreover, the… 
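
As a rough illustration of the kind of ensemble-based uncertainty measures this line of work studies (total, data and knowledge uncertainty over a single slot's value distribution), here is a minimal Python sketch; the function name, array shapes and example values are illustrative assumptions, not code from the paper.

import numpy as np

def ensemble_uncertainty(member_probs, eps=1e-12):
    """Decompose ensemble uncertainty for one slot's value distribution.

    member_probs: array of shape (n_members, n_values); each row is the
    categorical distribution over candidate slot values predicted by one
    ensemble member. Returns (total, data, knowledge) uncertainty in nats.
    """
    # Predictive distribution: average the members' distributions.
    mean_probs = member_probs.mean(axis=0)
    # Total uncertainty: entropy of the averaged distribution.
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Data (aleatoric) uncertainty: average entropy of the individual members.
    data = -np.mean(np.sum(member_probs * np.log(member_probs + eps), axis=1))
    # Knowledge (epistemic) uncertainty: mutual information between the
    # prediction and the choice of ensemble member, i.e. total minus data.
    return total, data, total - data

if __name__ == "__main__":
    # Three hypothetical trackers scoring the "food" slot over
    # the values [italian, chinese, none].
    probs = np.array([
        [0.7, 0.2, 0.1],
        [0.6, 0.3, 0.1],
        [0.2, 0.7, 0.1],  # a disagreeing member raises knowledge uncertainty
    ])
    total, data, knowledge = ensemble_uncertainty(probs)
    print(f"total={total:.3f} data={data:.3f} knowledge={knowledge:.3f}")

The gap between the entropy of the averaged prediction and the average per-member entropy (the mutual information) grows when the ensemble members disagree, which is the kind of signal a downstream dialogue policy can use when deciding whether to confirm or re-request a slot.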

Citations

Dynamic Dialogue Policy Transformer for Continual Reinforcement Learning
TLDR
The dynamic dialogue policy transformer (DDPT) is proposed, a novel dynamic architecture that can integrate new knowledge seamlessly, is capable of handling large state spaces, and obtains significant zero-shot performance when exposed to unseen domains, without any growth in network parameter size.
"Do you follow me?": A Survey of Recent Approaches in Dialogue State Tracking
TLDR
It is argued that some critical aspects of dialogue systems, such as generalizability, are still underexplored, and several research avenues are proposed to motivate future studies.

References

Showing 1-10 of 51 references
Gaussian Processes for POMDP-Based Dialogue Manager Optimization
  • Milica Gašić, S. Young
  • Computer Science
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2014
TLDR
It is shown that GP policy optimization can be implemented for a real-world POMDP dialogue manager, and it is demonstrated that designer effort can be substantially reduced by basing the policy directly on the full belief space, thereby avoiding ad hoc feature space modeling.
Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces
TLDR
It is found that ACER trains significantly faster than the current state of the art in deep learning approaches for spoken dialogue systems, and is tested in a very large action space, which has two orders of magnitude more actions than previously considered.
Training and Evaluation of the HIS POMDP Dialogue System in Noise
TLDR
The results obtained from a user trial show that the HIS system with a trained policy performed significantly better than the MDP baseline, and that its inherent ability to model uncertainty allows the POMDP model to exploit alternative hypotheses from the speech understanding system.
Natural actor and belief critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs
TLDR
A novel algorithm is presented for learning parameters in statistical dialogue systems modeled as Partially Observable Markov Decision Processes (POMDPs), and it is shown that model parameters estimated to maximize the expected cumulative reward result in significantly improved performance compared to baseline hand-crafted model parameters.
Knowing What You Know: Calibrating Dialogue Belief State Distributions via Ensembles
TLDR
This work presents state-of-the-art performance in calibration for multi-domain dialogue belief trackers using a calibrated ensemble of models and outperforms previous dialogue belief tracking models in terms of accuracy.
Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments
TLDR
The deployment of a real-world restaurant information system and its evaluation in a motor car, with subjects recruited locally and remote users recruited via Amazon Mechanical Turk, are described.
Partially observable Markov decision processes for spoken dialog systems
Guided Dialogue Policy Learning without Adversarial Learning in the Loop
TLDR
The adversarial training is decomposed into two steps, and the proposed approach achieves a remarkable task success rate using both on-policy and off-policy reinforcement learning methods, with the potential to transfer knowledge from existing domains to a new domain.
Deep Neural Network Approach for the Dialog State Tracking Challenge
TLDR
The paper explores some aspects of the training, and the resulting tracker is found to perform competitively, particularly on a corpus of dialogs from a system not present in the training data.