Multi-Granularity Representations of Dialog

Shikib Mehri and Maxine Eskénazi

Neural models of dialog rely on generalized latent representations of language. This paper introduces a novel training procedure which explicitly learns multiple representations of language at several levels of granularity. The multi-granularity training algorithm modifies the mechanism by which negative candidate responses are sampled in order to control the granularity of learned latent representations. Strong performance gains are observed on the next utterance retrieval task using both the… 
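
The abstract does not spell out how negative candidates are chosen, so the following is only a minimal sketch of one plausible scheme, not the authors' algorithm. It assumes that each candidate response already has a vector, and that granularity is controlled by how similar the sampled negatives are to the true response: dissimilar negatives for a coarse level, near-miss negatives for a fine level. The names `cosine`, `sample_negatives`, `level`, and `num_levels` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_negatives(truth_vec, pool, level, num_levels, k, rng=random):
    """Pick up to k negatives for one granularity level (assumed scheme).

    pool: list of (response_text, vector) pairs, excluding the true response.
    level: 0 is the coarsest level (least similar negatives),
           num_levels - 1 is the finest (most similar negatives).
    """
    # Rank candidates from least to most similar to the true response.
    ranked = sorted(pool, key=lambda item: cosine(truth_vec, item[1]))
    # Split the ranking into num_levels contiguous similarity bands.
    band = math.ceil(len(ranked) / num_levels)
    start = level * band
    candidates = ranked[start:start + band]
    # Sample without replacement from the band for this level.
    return rng.sample(candidates, min(k, len(candidates)))


# Toy data: 2-D vectors standing in for response embeddings.
truth = (1.0, 0.0)
pool = [
    ("a", (1.0, 0.1)), ("b", (0.9, 0.4)), ("c", (0.5, 0.5)),
    ("d", (0.1, 1.0)), ("e", (-0.5, 0.5)), ("f", (-1.0, 0.0)),
]
coarse = sample_negatives(truth, pool, level=0, num_levels=3, k=2)
fine = sample_negatives(truth, pool, level=2, num_levels=3, k=2)
```

Under this assumption, a model trained at each level sees a different difficulty of negatives, which is one way the sampling mechanism could shape what the learned representation has to distinguish.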

Recent Advances and Challenges in Task-oriented Dialog System

Three critical topics for task-oriented dialog systems are discussed: improving data efficiency to facilitate dialog modeling in low-resource settings, modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and integrating domain ontology knowledge into the dialog model.

YNU-HPCC at SemEval-2020 Task 10: Using a Multi-granularity Ordinal Classification of the BiLSTM Model for Emphasis Selection

A multi-granularity ordinal classification method is proposed to address the problem of emphasis selection; word embeddings are learned from Embeddings from Language Models (ELMo) to extract feature vector representations.

Research on Dual-Dimensional Entity Association-Based Question and Answering Technology for Smart Medicine

A question answering method based on dual-dimensional entity association is proposed for intelligent medicine; it learns semantics from the question dimension and the answer dimension separately.

Medical Term and Status Generation From Chinese Clinical Dialogue With Multi-Granularity Transformer

A Multi-granularity Transformer (MGT) model is proposed to enhance the dialogue context understanding from multi-granular features, and incorporates word-level information by adapting a Lattice-based encoder with the proposed relative position encoding method.



Pretraining Methods for Dialog Context Representation Learning

This paper examines various unsupervised pretraining objectives for learning dialog context representations. Two novel methods of pretraining dialog context encoders are proposed, and a total of four

Deep Contextualized Word Representations

A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique

Multi-view Response Selection for Human-Computer Conversation

A multi-view response selection model is proposed that integrates information from two different views, the word sequence view and the utterance sequence view, and significantly outperforms single-view baselines.

Skip-Thought Vectors

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the

Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network

This paper investigates matching a response with its multi-turn context using dependency information based entirely on attention. Drawing on the Transformer from machine translation, it extends the attention mechanism in two ways and jointly introduces those two kinds of attention in one uniform neural network.

Improving Language Understanding by Generative Pre-Training

The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.

Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots

A sequential matching network (SMN) first matches a response with each utterance in the context on multiple levels of granularity, and distills important matching information from each pair as a vector with convolution and pooling operations.

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models, which favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge-transfer across tasks.

TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents

A new approach to generative data-driven dialogue systems (e.g. chatbots), called TransferTransfo, is introduced. It combines a transfer-learning-based training scheme with a high-capacity Transformer model and shows strong improvements over the current state-of-the-art end-to-end conversational models.