• Publications
  • Influence
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
TLDR
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
TLDR
A neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps, that improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context.
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
TLDR
This work investigates evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available and shows that these metrics correlate very weakly with human judgements in the non-technical Twitter domain, and not at all in the technical Ubuntu domain.
An Actor-Critic Algorithm for Sequence Prediction
TLDR
An approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL) that condition the critic network on the ground-truth output, and shows that this method leads to improved performance on both a synthetic task, and for German-English machine translation.
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
TLDR
An evaluation model (ADEM) that learns to predict human-like scores to input responses, using a new dataset of human response scores and it is shown that the ADEM model's predictions correlate significantly, and at a level much higher than word-overlap metrics such as BLEU, with human judgements at both the utterance and system-level.
The Second Conversational Intelligence Challenge (ConvAI2)
TLDR
To improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics like perplexity to measure the performance across sequences of utterances (conversations)—in terms of repetition, consistency and balance of dialogue acts.
On the Pitfalls of Measuring Emergent Communication
TLDR
By training deep reinforcement learning agents to play simple matrix games augmented with a communication channel, this paper finds a scenario where agents appear to communicate, and yet the messages do not impact the environment or other agent in any way.
A Survey of Available Corpora for Building Data-Driven Dialogue Systems
TLDR
A wide survey of publicly available datasets suitable for data-driven learning of dialogue systems is carried out and important characteristics of these datasets are discussed and how they can be used to learn diverse dialogue strategies.
Recursively Summarizing Books with Human Feedback
TLDR
This method combines learning from human feedback with recursive task decomposition: it uses models trained on smaller parts of the task to assist humans in giving feedback on the broader task, and generates sensible summaries of entire books.
...
...