Am I Me or You? State-of-the-Art Dialogue Models Cannot Maintain an Identity
Kurt Shuster, Jack Urbanek, Arthur D. Szlam, Jason Weston
State-of-the-art dialogue models still often stumble with regard to factual accuracy and self-contradiction. Anecdotally, they have been observed to fail to maintain character identity throughout discourse; more specifically, they may take on the role of their interlocutor. In this work we formalize and quantify this deficiency, and show experimentally through human evaluations that this is indeed a problem. In contrast, we show that discriminative models trained specifically to recognize who…
Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models
This work proposes an efficient data collection framework that leverages in-context few-shot learning of large-scale language models to build a role-satisfying dialogue dataset from scratch, and compares various architectures for open-domain dialogue systems in terms of meeting role specifications while maintaining conversational abilities.
Investigating person-specific errors in chat-oriented dialogue systems
Creating chatbots that behave like real people is important in terms of believability. Errors in general chatbots and chatbots that follow a rough persona have been studied, but those in chatbots that…
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
A new architecture, DIRECTOR, is introduced: a unified generator-classifier with both a language-modeling head and a classification head for each output token, which outperforms existing model-guiding approaches in terms of both accuracy and efficiency.
Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training
This work shows how these failure modes of generative dialogue models can be addressed by extending the recently introduced unlikelihood loss to such cases, and demonstrates the efficacy of this approach across several dialogue tasks.
Retrieval Augmentation Reduces Hallucination in Conversation
This work explores the use of neural-retrieval-in-the-loop architectures recently shown to be effective in open-domain QA for knowledge-grounded dialogue, a task that is arguably more challenging as it requires querying based on complex multi-turn dialogue context and generating conversationally coherent responses.
Personalizing Dialogue Agents: I have a dog, do you have pets too?
This work collects data and trains models to condition on their given profile information and on information about the person they are talking to, resulting in improved dialogues as measured by next-utterance prediction.
I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling
The DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues are introduced and it is shown that the best contradiction detection model correlates well with human judgments and is used in both automatically evaluating and improving the consistency of state-of-the-art generative chatbots.
Wizard of Wikipedia: Knowledge-Powered Conversational Agents
The best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while a new benchmark allows for measuring further improvements in this important research direction.
Training Millions of Personalized Dialogue Agents
A new dataset providing 5 million personas and 700 million persona-based dialogues is introduced and it is shown that, at this scale, training using personas still improves the performance of end-to-end systems.
Internet-Augmented Dialogue Generation
An approach is proposed that learns to generate an internet search query based on the context, and then conditions on the search results to generate a response, allowing the model to employ up-to-the-minute relevant information.
Learning to Speak and Act in a Fantasy Text Adventure Game
This work introduces a large-scale crowdsourced text adventure game as a research platform for studying grounded dialogue, and describes the results of training state-of-the-art generative and retrieval models in this setting.
Recipes for Building an Open-Domain Chatbot
Human evaluations show the best models outperform existing approaches in multi-turn dialogue on engagingness and humanness measurements, and the limitations of this work are discussed by analyzing failure cases of the models.
How Decoding Strategies Affect the Verifiability of Generated Text
This work investigates the verifiability of text generated by state-of-the-art pre-trained language models and discovers a tradeoff between factuality and repetitiveness.