Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs

Chiori Hori, Takaaki Hori, Shinji Watanabe, John R. Hershey
To understand speaker intentions accurately in a dialog, it is important to consider the context of the surrounding sequence of dialog turns. Furthermore, each speaker may play a different role in the conversation, such as agent versus client, and thus features related to these roles may be important to the context. In previous work, we proposed context-sensitive spoken language understanding (SLU) using role-dependent long short-term memory (LSTM) recurrent neural networks (RNNs), and showed… 
Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder
This work proposes role-play dialogue-aware language models (RPDA-LMs) that leverage interactive contexts in role-play multi-turn dialogues to estimate the generative probability of words, composing the RPDA-LMs by extending hierarchical recurrent encoder-decoder modeling to handle role information.
Speaker-sensitive dual memory networks for multi-turn slot tagging
A neural architecture with speaker-sensitive dual memory networks that encodes utterances differently depending on the speaker, showing a significant performance improvement over state-of-the-art slot-tagging models that use contextual information.
Identifying Domain Independent Update Intents in Task Based Dialogs
This paper proposes a new type of semantic class for user intents, Update Intents (UIs), directly related to the type of update a user intends to perform for a slot-value pair, and builds a multi-class classification model using LSTMs to identify the type of UI in user utterances in the Restaurant and Shopping domains.
Dialog state tracking with attention-based sequence-to-sequence learning
An advanced dialog state tracking system designed for the 5th Dialog State Tracking Challenge (DSTC5) is presented, which includes an encoder-decoder architecture with an attention mechanism to map an input word sequence to a set of semantic labels, i.e., slot-value pairs.
May I take your order? A Neural Model for Extracting Structured Information from Conversations
This paper develops a sequence-to-sequence model that is able to map from unstructured conversational input to the structured form that is conveyed to the kitchen and appears on the customer receipt.
Joint Intention Detection and Semantic Slot Filling Based on BLSTM and Attention
A bidirectional long short-term memory (BLSTM) model with an attention mechanism is used to jointly identify intents and fill semantic slots for Hohhot bus queries; the experimental results show that the model performs well on both intent detection and semantic slot filling.
Efficient Large-Scale Neural Domain Classification with Personalized Attention
This paper proposes a scalable neural model architecture with a shared encoder, a novel attention mechanism that incorporates personalization information and domain-specific classifiers that solves the problem efficiently and demonstrates that incorporating personalization significantly improves domain classification accuracy in a setting with thousands of overlapping domains.
Prosodic Plot of Dialogues: A Conceptual Framework to Trace Speakers' Role
A unified set of Map Task dialogues, unique in that each speaker participated twice (once as a follower and once as a leader, with the same interlocutor playing the other role), is used to analyze dialogue structure.
A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding
A set of efficient and scalable shortlisting-reranking neural models for effective large-scale domain classification for IPDAs and shows the effectiveness of the approach with extensive experiments on 1,500 IPDA domains.
A Framework for pre-training hidden-unit conditional random fields and its extension to long short term memory networks
A simple unsupervised framework for pre-training hidden-unit conditional random fields (HUCRFs) by using the separation of HUCRF parameters between observations and labels to pre-train observation parameters independently of label parameters.


Context Sensitive Spoken Language Understanding Using Role Dependent LSTM Layers
Neural network models have become a recent focus of investigation in spoken language understanding (SLU). To understand speaker intentions accurately in a dialog, it is important to consider the context of the surrounding sequence of dialog turns.
Contextual spoken language understanding using recurrent neural networks
The proposed method obtains new state-of-the-art results on ATIS and improved performances over baseline techniques such as conditional random fields (CRFs) on a large context-sensitive SLU dataset.
Efficient learning for spoken language understanding tasks with word embedding based pre-training
This paper investigates the use of unsupervised training methods with large-scale corpora based on word embedding and latent topic models to pre-train the SLU networks and proposes a novel Recurrent Neural Network (RNN) architecture.
Spoken language understanding using long short-term memory neural networks
This paper investigates long short-term memory (LSTM) neural networks, which contain input, output, and forget gates and are more advanced than simple RNNs, for the word-labeling task, and proposes a regression model on top of the unnormalized LSTM scores to explicitly model output-label dependence.
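As a reference point for the gating terminology used in several entries here, the LSTM cell update can be sketched in a few lines of plain Python (a minimal single-step, scalar-sized illustration under standard LSTM equations, not the implementation of any paper above; all weight names are hypothetical):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One scalar-sized LSTM step: input (i), forget (f), and output (o)
    gates plus a tanh candidate (g) update cell state c and hidden state h."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])   # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])   # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])   # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate
    c = f * c_prev + i * g      # forget gate scales old memory, input gate admits new
    h = o * math.tanh(c)        # output gate exposes a squashed view of the cell
    return h, c

# toy weights purely for illustration
weights = {k: 0.5 for k in
           ["wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg"]}
h, c = lstm_step(1.0, 0.0, 0.0, weights)
```

The gating is what lets the cell state carry information across long spans: the forget gate decides how much memory to keep, while the input gate decides how much new evidence to write.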
Recurrent neural networks for language understanding
This paper modifies the architecture to perform language understanding and advances the state of the art for the widely used ATIS dataset.
Recurrent conditional random field for language understanding
This paper shows that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model; specifically, the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding.
Attention-Based Models for Speech Recognition
The attention mechanism is extended with features needed for speech recognition, and a novel, generic method of adding location awareness to the attention mechanism is proposed to alleviate high phoneme error rates.
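The location-awareness idea can be illustrated by feeding the previous attention weights, smoothed by a small convolution, into the scorer before the softmax, so attention tends to move locally rather than jump. This is a toy sketch with made-up scores and kernel values, not the paper's exact formulation:

```python
import math

def location_aware_attention(scores, prev_align, kernel):
    """Add a convolved view of the previous alignment to content scores,
    then renormalize with a softmax."""
    T, K = len(prev_align), len(kernel)
    pad = K // 2
    # 1-D convolution of the previous alignment (zero-padded at the edges)
    loc = [sum(kernel[k] * prev_align[t + k - pad]
               for k in range(K) if 0 <= t + k - pad < T)
           for t in range(T)]
    combined = [s + l for s, l in zip(scores, loc)]
    m = max(combined)                       # stabilized softmax
    exps = [math.exp(v - m) for v in combined]
    z = sum(exps)
    return [e / z for e in exps]

align = location_aware_attention(
    scores=[0.1, 0.1, 0.1, 0.1],        # flat content scores (toy values)
    prev_align=[0.0, 1.0, 0.0, 0.0],    # attention was on position 1
    kernel=[1.0, 0.5, 0.2],             # biased toward moving one step forward
)
```

With flat content scores, the location features alone pull the new alignment one step forward from the previous focus, which is the monotonic-progression behavior useful in speech recognition.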
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words, providing a unique resource for research in unstructured multi-turn dialogue systems.
Statistical dialog management applied to WFST-based dialog systems
An expandable dialog scenario description and platform for managing dialog systems using a weighted finite-state transducer (WFST); it is confirmed that a dialog scenario automatically acquired from a corpus can manage dialog reasonably well on the WFST-based dialog management platform.
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions about sequence structure, and finds that reversing the order of the words in all source sentences markedly improved the LSTM's performance, because doing so introduced many short-term dependencies between the source and target sentences that made the optimization problem easier.
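The source-reversal trick is mechanically simple: the encoder consumes the source tokens in reverse, which places the first source word adjacent to the decoder's first prediction. A minimal sketch of that data-preparation step (the token lists and function name are hypothetical, for illustration only):

```python
def prepare_seq2seq_pair(source_tokens, target_tokens):
    """Reverse the source sequence so its first word sits closest to the
    decoder's first output, shortening that dependency for the encoder."""
    return list(reversed(source_tokens)), target_tokens

# e.g. a French-to-English pair: "je" aligns with "i"
src, tgt = prepare_seq2seq_pair(["je", "suis", "ici"], ["i", "am", "here"])
```

After reversal the encoder reads "je" last, immediately before the decoder must emit "i", while the average source-target distance is unchanged.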