Corpus ID: 238744355

Teaching Models new APIs: Domain-Agnostic Simulators for Task Oriented Dialogue

Moya Chen, Paul A. Crook, Stephen Roller
We demonstrate that large language models are able to simulate Task Oriented Dialogues in novel domains, provided only with an API implementation and a list of goals. We show these simulations can formulate online, automatic metrics that correlate well with human evaluations. Furthermore, by checking whether the User's goals are met, we can use simulation to repeatedly generate training data and improve the quality of the simulations themselves. With no human intervention or domain-specific…
CheckDST: Measuring Real-World Generalization of Dialogue State Tracking Performance
It is argued that dialogue state tracking (DST) models should be assessed more holistically rather than by pursuing state-of-the-art joint goal accuracy (JGA), since a higher JGA does not guarantee better overall robustness; CheckDST, a collection of metrics, is designed to facilitate comparison of DST models across comprehensive dimensions of robustness by testing well-known weaknesses with augmented test sets.


A User Simulator for Task-Completion Dialogues
A new, publicly available simulation framework in which the simulator, designed for the movie-booking domain, leverages both rules and collected data; several agents are demonstrated, and the procedure for adding and testing one's own agent is detailed.
Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems
This work presents Alexa Conversations, a new approach for building goal-oriented dialogue systems that is scalable, extensible, and data-efficient, and that provides out-of-the-box support for natural conversational phenomena, such as entity sharing across turns or users changing their mind during a conversation, without requiring developers to provide any such dialogue flows.
How to Build User Simulators to Train RL-based Dialog Systems
A method of standardizing user simulator building is proposed, which the community can use to fairly compare dialog system quality using the same set of user simulators; human users are asked to assess the simulators both directly, by rating the simulated dialogs, and indirectly, by interacting with the trained systems.
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
This work introduces the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue in which predictions are made over a dynamic set of intents and slots provided as input.
Quantitative Evaluation of User Simulation Techniques for Spoken Dialogue Systems
A systematic approach to testing user simulations is presented, which assesses the most prominent domain-independent techniques using a large DARPA Communicator corpus of human-computer dialogues and shows that simple statistical metrics are still sufficient to discern synthetic from real dialogues.
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
This paper presents MinTL, a minimalist transfer learning framework for task-oriented dialogue systems, and introduces Levenshtein belief spans (Lev), which allow efficient dialogue state tracking with a minimal generation length and greatly improve the inference efficiency of MinTL-based systems.
End-to-End Learning of Task-Oriented Dialogs
This thesis proposal designs a neural-network-based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialogs.
Bootstrapping a Neural Conversational Agent with Dialogue Self-Play, Crowdsourcing and On-Line Reinforcement Learning
This paper discusses the advantages of this approach for industry applications of conversational agents, wherein an agent can be rapidly bootstrapped to deploy in front of users and further optimized via interactive learning from actual users of the system.
Multi-Domain Task-Completion Dialog Challenge
The goal is to investigate whether sample complexity can decrease with time, i.e., if a dialog system that was trained on a large corpus can learn to converse about a new domain given a much smaller in-domain corpus.
Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems
This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on…