Context-aware Neural-based Dialog Act Classification on Automatically Generated Transcriptions
@article{Ortega2019ContextawareND,
  title={Context-aware Neural-based Dialog Act Classification on Automatically Generated Transcriptions},
  author={Daniel Ortega and Chia-Yu Li and Gisela Vallejo and Pavel Denisov and Ngoc Thang Vu},
  journal={ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2019},
  pages={7265-7269}
}
This paper presents our latest investigations on dialog act (DA) classification on automatically generated transcriptions. … Furthermore, they show that although the word error rates are comparable, the End-to-End ASR system seems to be more suitable for DA classification.
10 Citations
Sentence encoding for Dialogue Act classification
- Computer Science
- Natural Language Engineering
- 2021
In this study, we investigate the process of generating single-sentence representations for the purpose of Dialogue Act (DA) classification, including several aspects of text pre-processing and…
Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
- Computer Science
- INTERSPEECH
- 2020
A novel training method is proposed that enables pretrained contextual embeddings to process acoustic features and is based on the teacher-student framework across speech and text modalities that aligns the acoustic and the semantic latent spaces.
Towards Emotion-aided Multi-modal Dialogue Act Classification
- Computer Science
- ACL
- 2020
This work builds an attention-based multi-modal, multi-task deep neural network for joint learning of DAs and emotions, and shows empirically that multi-modality and multi-tasking achieve better DAC performance than uni-modal and single-task DAC variants.
Emotion Aided Dialogue Act Classification for Task-Independent Conversations in a Multi-modal Framework
- Computer Science
- Cognitive Computation
- 2020
A DL-based multi-tasking network for DAC and emotion recognition (ER) has been developed, incorporating attention to facilitate the fusion of different modalities; the efficacy of the proposed approach and the importance of incorporating emotion while identifying the DAs are established.
Dialogue History Integration into End-to-End Signal-to-Concept Spoken Language Understanding Systems
- Computer Science
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
This work investigates embeddings for representing dialog history in spoken language understanding (SLU) systems and proposes the following three types of h-vectors: supervised-all, supervised-freq, and unsupervised.
A Multilingual Neural Coaching Model with Enhanced Long-term Dialogue Structure
- Computer Science
- ACM Trans. Interact. Intell. Syst.
- 2022
This work develops a fully data-driven conversational agent capable of carrying out motivational coaching sessions in Spanish, French, Norwegian, and English, together with a global deep learning system that controls the long-term structure of the dialogue.
Research on an Improved CNN Speech Recognition System Based on Hidden Markov Model
- Computer Science
- 2020 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS)
- 2020
The experimental results show that the improved CNN based on the hidden Markov model can complete the recognition function, which verifies the correctness of the algorithm.
ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet
- Computer Science
- ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2022
This work enhances the toolkit to provide implementations for various SLU benchmarks that enable researchers to seamlessly mix-and-match different ASR and NLU models, and provides pretrained models with intensively tuned hyper-parameters that can match or even outperform the current state-of-the-art performances.
Speaker and Time-aware Joint Contextual Learning for Dialogue-act Classification in Counselling Conversations
- Computer Science
- WSDM
- 2022
This work develops a novel dataset, named HOPE, to provide a platform for the dialogue-act classification in counselling conversations, identifies the requirement of such conversation, and proposes SPARTA, a transformer-based architecture with a novel speaker- and time-aware contextual learning for the Dialogue-Act classification.
References
Automatic dialog act segmentation and classification in multiparty meetings
- Computer Science
- Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
- 2005
It is found that a very simple prosodic model aids performance over lexical information alone, especially for segmentation, in the two related tasks of dialog act segmentation and DA classification for speech from the ICSI Meeting Corpus.
Joint segmentation and classification of dialog acts using conditional random fields
- Computer Science
- INTERSPEECH
- 2009
Although the proposed framework is conceptually simpler than previous attempts at segmentation and classification of DAs, it outperforms all previous systems for a task based on the ICSI (MRDA) meeting corpus.
Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks
- Computer Science
- NAACL
- 2016
This work presents a model based on recurrent neural networks and convolutional neural networks that incorporates the preceding short texts and achieves state-of-the-art results on three different datasets for dialog act prediction.
Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units
- Computer Science
- 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)
- 2018
This paper presents an end-to-end automatic speech recognition system that successfully employs subword units, obtained by the byte-pair encoding (BPE) compression algorithm, in a hybrid CTC-Attention based system.
EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding
- Computer Science
- 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
- 2015
This paper presents the Eesen framework, which drastically simplifies the existing pipeline to build state-of-the-art ASR systems and achieves comparable word error rates (WERs) while at the same time speeding up decoding significantly.
End-to-end attention-based large vocabulary speech recognition
- Computer Science
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2016
This work investigates an alternative method for sequence modelling based on an attention mechanism that allows a Recurrent Neural Network (RNN) to learn alignments between sequences of input frames and output labels.
The ICSI Meeting Recorder Dialog Act (MRDA) Corpus
- Computer Science
- SIGDIAL Workshop
- 2004
A new corpus of over 180,000 hand-annotated dialog act tags and accompanying adjacency pair annotations for roughly 72 hours of speech from 75 naturally-occurring meetings is described.
Globally Normalized Transition-Based Neural Networks
- Computer Science
- ACL
- 2016
We introduce a globally normalized transition-based neural network model that achieves state-of-the-art part-of-speech tagging, dependency parsing and sentence compression results. Our model is a…
Recurrent Convolutional Neural Networks for Discourse Compositionality
- Computer Science
- CVSM@ACL
- 2013
The discourse model coupled to the sentence model obtains state-of-the-art performance on a dialogue act classification experiment and is able to capture both the sequentiality of sentences and the interaction between different speakers.
Dialogue act modeling for automatic tagging and recognition of conversational speech
- Computer Science
- CL
- 2000
A probabilistic integration of speech recognition with dialogue modeling is developed to improve both speech recognition and dialogue act classification accuracy.