Speaker Turn Modeling for Dialogue Act Classification

Zihao He, Leili Tavabi, Kristina Lerman, M. Soleymani

Dialogue Act (DA) classification is the task of classifying utterances with respect to the function they serve in a dialogue. Existing approaches to DA classification model utterances without incorporating the turn changes among speakers throughout the dialogue, thereby treating dialogue no differently from non-interactive written text. In this paper, we propose to integrate the turn changes in conversations among speakers when modeling DAs. Specifically, we learn conversation-invariant speaker turn…
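
As a rough illustration of what "turn changes" could look like as a model input, the sketch below derives a binary flag per utterance from a sequence of speaker labels. It is not the authors' method: the function name `turn_change_flags`, the sample speaker labels, and the convention of marking the first utterance with 0 are all assumptions made for this example, and the truncated abstract does not say how the paper encodes turns.

```python
from typing import List, Sequence


def turn_change_flags(speakers: Sequence[str]) -> List[int]:
    """Mark each utterance with 1 if its speaker differs from the previous one.

    The first utterance has no predecessor, so it is marked 0 here
    (an assumed convention, not taken from the paper).
    """
    flags = [0] * len(speakers)
    for i in range(1, len(speakers)):
        flags[i] = int(speakers[i] != speakers[i - 1])
    return flags


# Hypothetical two-speaker conversation, one label per utterance.
speakers = ["A", "A", "B", "A", "A", "B", "B"]
flags = turn_change_flags(speakers)
print(flags)
```

A classifier could, for instance, look up a learned embedding for each flag and combine it with the utterance representation; whether and how the paper does this is not stated in the text above.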

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

This work explores few-shot data augmentation for dialogue understanding by prompting large pre-trained language models and presents a novel approach that iterates on augmentation quality by applying weakly-supervised agents.

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

Experimental results show that the proposed formulation can outperform state-of-the-art methods based on target speaker voice activity detection, and that performance can be further improved with SOND, resulting in a 6.30% relative diarization error reduction.



A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks

This work proposes a novel context-based learning method to classify dialogue acts using a character-level language model utterance representation, and shows significant improvement in dialogue act detection.

A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification

Experimental results show that by modelling topic as an auxiliary task, the proposed dual-attention hierarchical recurrent neural network can significantly improve DA classification, yielding better or comparable performance to the state-of-the-art method on three public datasets.

Dialogue act modeling for automatic tagging and recognition of conversational speech

A probabilistic integration of speech recognition with dialogue modeling is developed, to improve both speech recognition and dialogue act classification accuracy.

Dialogue Act Sequence Labeling using Hierarchical encoder with CRF

A hierarchical recurrent neural network is built using bidirectional LSTM as a base unit and a conditional random field as the top layer to classify each utterance into its corresponding dialogue act, thus modeling the dependencies among both labels and utterances, an important consideration in natural dialogue.

Dialogue Act Recognition via CRF-Attentive Structured Network

This paper tackles dialogue act recognition (DAR) by extending richer Conditional Random Field (CRF) structured dependencies without abandoning end-to-end training, and incorporates hierarchical semantic inference with a memory mechanism for utterance modeling at multiple levels.

Dialogue Act Classification with Context-Aware Self-Attention

This work treats Dialogue Act classification as a sequence labeling problem and addresses it with a context-aware self-attention mechanism coupled with a hierarchical recurrent neural network.

Speaker Role Contextual Modeling for Language Understanding and Dialogue Policy Learning

A role-based contextual model is proposed that considers different speaker roles independently, based on their various speaking patterns in multi-turn dialogues, to improve language understanding and dialogue policy learning tasks.

Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training

This paper first casts the problem as a question-answering problem and proposes an improved dynamic memory network with a hierarchical pyramidal utterance encoder, which is not only robust but also achieves better performance than some state-of-the-art baselines.

Dynamic time-aware attention to speaker roles and contexts for spoken language understanding

This paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner.

Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos

A deep neural framework is proposed, termed conversational memory network, which leverages contextual information from the conversation history to recognize utterance-level emotions in dyadic conversational videos.