A Short Survey of Pre-trained Language Models for Conversational AI - A New Age in NLP

Munazza Zaib, Quan Z. Sheng, and W. Zhang
Proceedings of the Australasian Computer Science Week Multiconference

Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem of agent-based computing. Rapid growth in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring…


Advances in Multi-turn Dialogue Comprehension: A Survey

The characteristics and challenges of dialogue comprehension, in contrast to plain-text reading comprehension, are summarized, and three typical patterns of dialogue modeling that are widely used in dialogue comprehension tasks, such as response selection and conversational question answering, are discussed.

A Technical Review on Knowledge-Intensive NLP for Pre-trained Language Development

The present progress of pre-trained language model-based knowledge-enhanced models (PLMKEs) is described by deconstructing their three key elements: information sources, knowledge-intensive NLP tasks, and knowledge fusion methods.

Conversational Question Answering: A Survey

There has been a trend shift from single-turn to multi-turn QA, which empowers the field of Conversational AI from different perspectives, and this survey is intended to provide an epitome for the research community in the hope of laying a strong foundation for the field of CQA.

Fusing Sentence Embeddings Into LSTM-based Autoregressive Language Models

An LSTM-based autoregressive language model that fuses (e.g., by concatenation) sentence embeddings from a pretrained masked language model into its context representation, obtaining richer context for language modelling and improving perplexity.
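The fusion idea summarized above, concatenating a pretrained sentence embedding with the LSTM's hidden state before the output projection, can be sketched as follows. The dimensions, the random placeholders, and the single projection step are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

hidden_dim, sent_dim, vocab_size = 8, 4, 10

# Hidden state produced by the LSTM at the current time step (placeholder).
h_t = rng.standard_normal(hidden_dim)

# Sentence embedding from a pretrained masked language model (placeholder).
s = rng.standard_normal(sent_dim)

# Fusion by concatenation: richer context = [LSTM state; sentence embedding].
context = np.concatenate([h_t, s])  # shape: (hidden_dim + sent_dim,)

# Output projection over the fused representation, then a softmax over the vocab.
W = rng.standard_normal((vocab_size, hidden_dim + sent_dim))
logits = W @ context
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The point of the sketch is only that the output distribution is conditioned on the concatenated vector rather than on the LSTM state alone; in practice the fused vector would feed the model's trained projection layer.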

Towards a Universal NLG for Dialogue Systems and Simulators with Future Bridging

A prototype FBNLG is evaluated to show that future bridging can be a viable approach to a universal few-shot NLG for task-oriented and chit-chat dialogues.

Pretrained Language Models for Text Generation: A Survey

This paper presents an overview of the major advances achieved in the topic of pretrained language models for text generation and discusses how to adapt existing PLMs to model different input data and satisfy special properties in the generated text.

A Survey of Pretrained Language Models Based Text Generation

This survey presents the recent advances achieved in the topic of PLMs for text generation and introduces three key points of applying PLMs to text generation, including how to encode the input data as representations that preserve input semantics and can be fused into PLMs.

Neurosymbolic AI for Situated Language Understanding

This model reincorporates some ideas of classic AI into a framework of neurosymbolic intelligence, using multimodal contextual modeling of interactive situations, events, and object properties, to solve a variety of AI learning challenges.

Do We Still Need Human Assessors? Prompt-Based GPT-3 User Simulation in Conversational AI

It is shown that in situations where very little data and resources are available, classifiers trained on such synthetically generated data might be preferable to the collection and annotation of naturalistic data.

Smoothing Dialogue States for Open Conversational Machine Reading

This work proposes an effective gating strategy by smoothing the two dialogue states in only one decoder and bridge decision making and question generation to provide a richer dialogue state reference and achieves new state-of-the-art results.

Hello, It’s GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

This paper proposes a task-oriented dialogue model that operates solely on text input: it effectively bypasses explicit policy and language generation modules, holds promise to mitigate the data scarcity problem, and supports the construction of more engaging and more eloquent task-oriented conversational agents.

Improving Language Understanding by Generative Pre-Training

The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.

Deep Contextualized Word Representations

A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.

QuAC: Question Answering in Context

QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as it shows in a detailed qualitative evaluation.

MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling

The Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning multiple domains and topics, is introduced; at 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
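The fine-tuning recipe in the summary above, a single task-specific output layer on top of the pretrained encoder, can be sketched as follows. The pooled vector and the layer shapes are illustrative stand-ins for BERT's pooled [CLS] representation, not calls to any real BERT implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

hidden_size, num_labels = 16, 3

# Stand-in for BERT's pooled [CLS] representation of an input sequence.
pooled_output = rng.standard_normal(hidden_size)

# The "one additional output layer": a single linear classification head.
W = rng.standard_normal((num_labels, hidden_size)) * 0.02
b = np.zeros(num_labels)

logits = W @ pooled_output + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_label = int(np.argmax(probs))
```

During fine-tuning, both the head (W, b) and the pretrained encoder weights are updated jointly on the downstream task, which is what makes the recipe so lightweight: only the small head is new.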

CoQA: A Conversational Question Answering Challenge

CoQA is introduced, a novel dataset for building Conversational Question Answering systems and it is shown that conversational questions have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning).

Semantics-aware BERT for Language Understanding

This work proposes to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduces an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone.

A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension

A simple but effective method with BERT for CMC that uses BERT to encode a paragraph conditioned independently on each question and each answer in a multi-turn context; it finds that the gold answer history contributed most to model performance on both datasets.

Neural Approaches to Conversational AI

This tutorial surveys neural approaches to conversational AI that were developed in the last few years, and presents a review of state-of-the-art neural approaches, drawing the connection between neural approaches and traditional symbolic approaches.