Corpus ID: 196470811

Effective Incorporation of Speaker Information in Utterance Encoding in Dialog

Tianyu Zhao, Tatsuya Kawahara
In dialog studies, we often encode a dialog using a hierarchical encoder: each utterance is converted into an utterance vector, and the sequence of utterance vectors is then converted into a dialog vector. Since knowing who produced which utterance is essential to understanding a dialog, conventional methods integrate speaker labels into the utterance vectors. We found this method problematic in cases where speaker annotations are inconsistent across different dialogs. A relative… 
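The hierarchical encoding the abstract describes can be illustrated with a minimal, non-neural sketch. This is not the paper's model: real systems use learned encoders (e.g. RNNs or Transformers), whereas here utterances are encoded by averaging toy random word vectors, and the "conventional" speaker integration is shown by concatenating a speaker-label vector onto each utterance vector. All names (`embed`, `encode_utterance`, `encode_dialog`, `DIM`) are hypothetical.

```python
import random

random.seed(0)
DIM = 8  # toy embedding dimensionality

def embed(token, _table={}):
    """Toy lookup table standing in for a learned embedding layer."""
    if token not in _table:
        _table[token] = [random.uniform(-1, 1) for _ in range(DIM)]
    return _table[token]

def encode_utterance(tokens, speaker):
    """Utterance encoder: average the word vectors, then concatenate a
    speaker embedding — the 'conventional' speaker-label integration
    the abstract refers to. Returns a 2*DIM-dimensional vector."""
    word_vecs = [embed(w) for w in tokens]
    utt = [sum(col) / len(word_vecs) for col in zip(*word_vecs)]
    return utt + embed(("spk", speaker))

def encode_dialog(turns):
    """Dialog encoder: pool the sequence of utterance vectors
    (here by averaging) into a single dialog vector."""
    utt_vecs = [encode_utterance(toks, spk) for spk, toks in turns]
    return [sum(col) / len(utt_vecs) for col in zip(*utt_vecs)]

dialog = [("A", ["hello", "there"]),
          ("B", ["hi", "how", "are", "you"]),
          ("A", ["fine", "thanks"])]
vec = encode_dialog(dialog)
print(len(vec))  # 16 = utterance dim + speaker dim
```

Note the weakness the abstract points out: if another corpus labels the same first speaker "B" instead of "A", `embed(("spk", ...))` yields a different vector for the same role, so absolute speaker labels do not transfer across inconsistently annotated dialogs.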
A Speaker-aware Parallel Hierarchical Attentive Encoder-Decoder Model for Multi-turn Dialogue Generation
This work proposes a speaker-aware Parallel Hierarchical Attentive Encoder-Decoder (PHAED) model that encodes each utterance with awareness of its speaker and of its contextual associations with the same speaker's previous messages.
Who did They Respond to? Conversation Structure Modeling using Masked Hierarchical Transformer
A novel masking mechanism is designed to guide the ancestor flow, and the proposed model, which takes into account the ancestral history of the conversation, significantly outperforms several strong baselines, including the BERT model, on all datasets.
A Tailored Pre-Training Model for Task-Oriented Dialog Generation
A Pre-trained Role Alternating Language model (PRAL), designed specifically for task-oriented conversational systems, models the two speakers separately; experiments show that PRAL performs better than or on par with state-of-the-art methods.
Designing Precise and Robust Dialogue Response Evaluators
This work proposes a reference-free evaluator that exploits semi-supervised training and pretrained (masked) language models; it achieves a strong correlation with human judgment and generalizes robustly to diverse responses and corpora.
Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models
Alternating Recurrent Dialog Model (ARDM) is a simple, general, and effective framework that outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ and can generalize to more challenging, non-collaborative tasks such as persuasion.
Dialogue Act Classification with Context-Aware Self-Attention
This work treats Dialogue Act classification as a sequence labeling problem and solves it with a context-aware self-attention mechanism coupled with a hierarchical recurrent neural network.
Dynamic time-aware attention to speaker roles and contexts for spoken language understanding
This paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner.
Speaker Role Contextual Modeling for Language Understanding and Dialogue Policy Learning
A role-based contextual model is proposed to consider different speaker roles independently based on the various speaking patterns in the multi-turn dialogues to improve language understanding and dialogue policy learning tasks.
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
The recently proposed hierarchical recurrent encoder-decoder neural network is extended to the dialogue domain, and it is demonstrated that this model is competitive with state-of-the-art neural language models and back-off n-gram models.
Decay-Function-Free Time-Aware Attention to Context and Speaker Indicator for Spoken Language Understanding
Time-aware models are proposed that automatically learn the latent time-decay function of the history without a manually specified decay function, along with a method to identify and label the current speaker to improve SLU accuracy.
Consistent Dialogue Generation with Self-supervised Feature Learning
This paper proposes a neural conversation model that generates consistent responses by maintaining certain features related to topics and personas throughout the conversation by adopting a binary feature representation and introducing a feature disentangling loss.
Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training
This paper first casts the problem as question answering and proposes an improved dynamic memory network with a hierarchical pyramidal utterance encoder, which is not only robust but also achieves better performance than several state-of-the-art baselines.
A Conditional Variational Framework for Dialog Generation
This paper proposes a framework allowing conditional response generation based on specific attributes, which can be either manually assigned or automatically detected and validated on two different scenarios, where the attribute refers to genericness and sentiment states respectively.
A Persona-Based Neural Conversation Model
This work presents persona-based models for handling the issue of speaker consistency in neural response generation that yield qualitative performance improvements in both perplexity and BLEU scores over baseline sequence-to-sequence models.
Addressee and Response Selection in Multi-Party Conversations with Speaker Interaction RNNs
Experimental results show that SI-RNN significantly improves the accuracy of addressee and response selection, particularly in complex conversations with many speakers and responses to distant messages many turns in the past.