Beyond Goldfish Memory: Long-Term Open-Domain Conversation

@inproceedings{Xu2022BeyondGM,
  title={Beyond Goldfish Memory: Long-Term Open-Domain Conversation},
  author={Jing Xu and Arthur D. Szlam and Jason Weston},
  booktitle={ACL},
  year={2022}
}
Despite recent improvements in open-domain dialogue models, state-of-the-art models are trained and evaluated on short conversations with little context. In contrast, the long-term conversation setting has hardly been studied. In this work we collect and release a human-human dataset consisting of multiple chat sessions whereby the speaking partners learn about each other’s interests and discuss the things they have learnt from past sessions. We show how existing models trained on existing…
Long Time No See! Open-Domain Conversation with Long-Term Persona Memory
TLDR
This is the first attempt to conduct real-time dynamic management of the persona information of both parties, the user and the bot, using a dialogue generation framework with a Long-Term Memory (LTM) mechanism, called PLATO-LTM.
Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation
TLDR
A retrieval-generation training framework that increases the usage of training data by directly treating the heterogeneous and noisy training data as "evidence"; model performance correlates positively with the relevance of the retrieved evidence.
Few-Shot Bot: Prompt-Based Learning for Dialogue Systems
TLDR
An end-to-end chatbot named the Few-Shot Bot is created, which automatically selects the most appropriate conversational skill, queries different knowledge bases or the internet, and uses the retrieved knowledge to generate a human-like response, all using only few dialogue examples per skill.
What Did You Say? Task-Oriented Dialog Datasets Are Not Conversational!?
TLDR
This work outlines a taxonomy of conversational and contextual effects, which is used to examine MultiWOZ, SGD and SMCalFlow, among the most recent and widely used task-oriented dialog datasets, and outlines desiderata for truly conversational dialog datasets.
PANGUBOT: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
TLDR
PANGUBOT’s response quality, knowledge correctness, and safety are still far from perfect, and further exploration is indispensable for building reliable and smart dialogue systems.
AfriWOZ: Corpus for Exploiting Cross-Lingual Transferability for Generation of Dialogues in Low-Resource, African Languages
TLDR
It is shown that the hypothesis that deep monolingual models learn some abstractions that generalize across languages holds, and the language with the most transferable properties is the Nigerian Pidgin English, with a human-likeness score of 78.1%.
Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents
TLDR
Five different crowdworker-based human evaluation methods are compared and it is found that different methods are best depending on the types of models compared, with no clear winner across the board.
Towards Continual Knowledge Learning of Language Models
TLDR
This work constructs a new benchmark and metric to quantify the retention of time-invariant world knowledge, the update of outdated knowledge, and the acquisition of new knowledge in Continual Knowledge Learning.
State-of-the-art in Open-domain Conversational AI: A Survey
TLDR
Results of the survey show that progress has been made with recent SoTA conversational AI, but persistent challenges remain to be solved, and female-gendered personas are more common than male ones in conversational AI.
Generative Spoken Dialogue Language Modeling
TLDR
dGSLM, the first “textless” model able to generate audio samples of naturalistic spoken dialogues, is introduced, and it reproduces naturalistic turn-taking.
...

References

SHOWING 1-10 OF 41 REFERENCES
Improving Neural Conversational Models with Entropy-Based Data Filtering
TLDR
This work presents a method of filtering dialog datasets by removing generic utterances from training data using a simple entropy-based approach that does not require human supervision, and shows that training on datasets filtered this way results in better conversational quality as chatbots learn to output more diverse responses.
Wizard of Wikipedia: Knowledge-Powered Conversational Agents
TLDR
The best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while a new benchmark allows for measuring further improvements in this important research direction.
The Gutenberg Dialogue Dataset
TLDR
This work builds a high-quality dataset of 14.8M utterances in English, and smaller datasets in German, Dutch, Spanish, Portuguese, Italian, and Hungarian, and describes and analyzes the effects of the various heuristics used, and presents an error analysis of extracted dialogues.
Retrieval Augmentation Reduces Hallucination in Conversation
TLDR
This work explores the use of neural-retrieval-in-the-loop architectures recently shown to be effective in open-domain QA for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses.
Training Millions of Personalized Dialogue Agents
TLDR
A new dataset providing 5 million personas and 700 million persona-based dialogues is introduced and it is shown that, at this scale, training using personas still improves the performance of end-to-end systems.
Recipes for Building an Open-Domain Chatbot
TLDR
Human evaluations show the best models outperform existing approaches in multi-turn dialogue on engagingness and humanness measurements, and the limitations of this work are discussed by analyzing failure cases of the models.
Dial2Desc: End-to-end Dialogue Description Generation
TLDR
A new task named Dialogue Description is proposed, which takes a dialogue text as input, then outputs a concise description of the object or the action involved in this conversation, and it is demonstrated that one can get more accurate and descriptive results using a new neural attentive model.
Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset
TLDR
This work proposes a new benchmark for empathetic dialogue generation and EmpatheticDialogues, a novel dataset of 25k conversations grounded in emotional situations, and presents empirical comparisons of dialogue model adaptations for empathetic responding, leveraging existing models or datasets without requiring lengthy re-training of the full model.
Towards a Human-like Open-Domain Chatbot
TLDR
Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations, is presented and a human evaluation metric called Sensibleness and Specificity Average (SSA) is proposed, which captures key elements of a human-like multi-turn conversation.
Dense Passage Retrieval for Open-Domain Question Answering
TLDR
This work shows that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework.
...