Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

@article{Mesnil2015UsingRN,
  title={Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding},
  author={Gr{\'e}goire Mesnil and Yann Dauphin and Kaisheng Yao and Yoshua Bengio and Li Deng and Dilek Z. Hakkani-T{\"u}r and Xiaodong He and Larry Heck and Gokhan Tur and Dong Yu and Geoffrey Zweig},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2015},
  volume={23},
  pages={530-539}
}
Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly… 
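
The abstract contrasts Elman, Jordan, and hybrid recurrences. The minimal sketch below is not the authors' original Theano implementation; the activations, layer sizes, and class names are illustrative assumptions. It only shows the core difference: an Elman network feeds its hidden state back, while a Jordan network feeds its previous output distribution back.

```python
# Minimal sketch of the two recurrences compared in the paper, applied one word
# at a time to a slot-filling sequence. Not the authors' code; sizes are assumed.
import torch
import torch.nn as nn

class ElmanCell(nn.Module):
    """Elman recurrence: h_t = sigma(W x_t + U h_{t-1}); the hidden state is fed back."""
    def __init__(self, emb_dim, hid_dim, n_slots):
        super().__init__()
        self.wx = nn.Linear(emb_dim, hid_dim)
        self.wh = nn.Linear(hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, n_slots)

    def forward(self, x_t, h_prev):
        h_t = torch.sigmoid(self.wx(x_t) + self.wh(h_prev))
        return self.out(h_t), h_t             # slot scores for word t, new hidden state

class JordanCell(nn.Module):
    """Jordan recurrence: h_t = sigma(W x_t + U y_{t-1}); the previous output is fed back."""
    def __init__(self, emb_dim, hid_dim, n_slots):
        super().__init__()
        self.wx = nn.Linear(emb_dim, hid_dim)
        self.wy = nn.Linear(n_slots, hid_dim)
        self.out = nn.Linear(hid_dim, n_slots)

    def forward(self, x_t, y_prev):
        h_t = torch.sigmoid(self.wx(x_t) + self.wy(y_prev))
        y_t = torch.softmax(self.out(h_t), dim=-1)
        return y_t                             # slot distribution, fed back at step t+1
```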

Citations

Effective Spoken Language Labeling with Deep Recurrent Neural Networks

TLDR
This paper designs new deep RNN architectures for Spoken Language Understanding, achieving state-of-the-art results on two widely used corpora for SLU: ATIS, in English, and MEDIA, in French.

Label-Dependency Coding in Simple Recurrent Networks for Spoken Language Understanding

TLDR
It is shown that modeling label dependencies brings no benefit on the (simple) ATIS database, where unstructured models can produce state-of-the-art results, and that the modification introduced in the proposed RNN outperforms traditional RNNs and CRF models on the MEDIA benchmark.

Using Deep Time Delay Neural Network for Slot Filling in Spoken Language Understanding

TLDR
This work proposes to use a multi-layer Time Delay Neural Network (TDNN), an architecture prevalent in current large-vocabulary continuous speech recognition, with a symmetric time delay offset, and achieves state-of-the-art results without any additional knowledge or data resources.
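
As a quick clarification of the symmetric time delay offset mentioned above, a single TDNN layer can be viewed as a 1-D convolution over a symmetric window of word positions; the sizes below are arbitrary assumptions, not the paper's configuration.

```python
# Illustrative only: a TDNN layer as a 1-D convolution whose kernel spans a
# symmetric window (-2..+2) around each word; stacking layers widens the context.
import torch
import torch.nn as nn

tdnn_layer = nn.Conv1d(in_channels=100, out_channels=128, kernel_size=5, padding=2)

words = torch.randn(1, 100, 20)          # (batch, feature_dim, sentence_length)
hidden = torch.relu(tdnn_layer(words))   # same length; each frame sees two words on each side
print(hidden.shape)                      # torch.Size([1, 128, 20])
```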

ClockWork-RNN Based Architectures for Slot Filling

TLDR
This paper proposes the use of ClockWork recurrent neural network (CW-RNN) architectures in the slot-filling domain and finds that they significantly outperform simple RNNs and achieve results among the state of the art while retaining lower complexity.

Recurrent Neural Network Structured Output Prediction for Spoken Language Understanding

TLDR
This work proposes to introduce label dependencies into model training by feeding the previous output label, chosen by a sampling approach over true and predicted labels, into the current sequence state, and shows that the proposed methods consistently outperform the baseline RNN system.
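
A rough sketch of the sampling idea, with hypothetical names and an arbitrary mixing probability; the paper's exact sampling schedule is not reproduced here.

```python
# Hypothetical illustration: during training, the label fed back into the current
# step is drawn either from the gold annotation or from the model's own prediction.
import random

def choose_feedback_label(gold_id: int, predicted_id: int, p_gold: float = 0.5) -> int:
    """Sample which previous label is passed to the current sequence state."""
    return gold_id if random.random() < p_gold else predicted_id
```

Mixing in the model's own predictions during training narrows the gap with decoding, where gold labels are unavailable.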

Exploring the use of Attention-Based Recurrent Neural Networks For Spoken Language Understanding

TLDR
First experiments carried out on the ATIS corpus confirm the quality of the RNN baseline system used in this paper, and show that an attention-based RNN performs better than RNN architectures recently proposed for the slot filling task.
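
To make the attention mechanism referred to above concrete, here is a minimal dot-product attention step over the encoder states of an utterance; the function name and shapes are assumptions, not the paper's exact formulation.

```python
# Sketch only: the tagger's current state attends over all encoder states of the
# utterance, and the weighted sum (context vector) augments the slot prediction.
import torch

def attend(query_state: torch.Tensor, encoder_states: torch.Tensor) -> torch.Tensor:
    """query_state: (hid,), encoder_states: (T, hid) -> context vector of shape (hid,)."""
    scores = encoder_states @ query_state        # (T,) dot-product relevance scores
    weights = torch.softmax(scores, dim=0)       # attention distribution over the words
    return weights @ encoder_states              # weighted sum of encoder states
```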

Modeling with Recurrent Neural Networks for Open Vocabulary Slots

TLDR
It is shown that the model outperforms existing RNN models at discovering ‘open-vocabulary’ slots without any external information, such as a named entity database or knowledge base, and that it also demonstrates superior performance on intent detection.

Joint Slot Filling and Intent Detection in Spoken Language Understanding by Hybrid CNN-LSTM Model

TLDR
A novel model is proposed that combines convolutional neural networks, for their ability to detect complex features in the input sequences by applying filters to frames of those inputs, with recurrent neural networks, which can keep track of long- and short-term dependencies in the input sequences.

A Comparison of Deep Learning Methods for Language Understanding

TLDR
It is demonstrated that neural networks without feature engineering outperform state-of-the-art statistical and deep learning approaches on all three tasks (except written meal descriptions, where the CRF is slightly better), and that deep, attention-based BERT, in particular, surpasses state-of-the-art results on these tasks.

Label-Dependencies Aware Recurrent Neural Networks

TLDR
This work proposes an evolution of the simple Jordan RNN in which labels are re-injected as input into the network and converted into embeddings, in the same way as words; this model is not only more effective than other RNNs but also outperforms sophisticated CRF models.
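
A minimal sketch of the label re-injection idea; a generic GRU cell stands in for the paper's Jordan-style recurrence, and the embedding sizes and names are illustrative assumptions.

```python
# Sketch only: the previous label is looked up in its own embedding table and
# concatenated with the word embedding before the recurrent update.
import torch
import torch.nn as nn

class LabelAwareTagger(nn.Module):
    def __init__(self, n_words, n_labels, word_dim=100, label_dim=30, hid_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.label_emb = nn.Embedding(n_labels, label_dim)      # labels embedded like words
        self.cell = nn.GRUCell(word_dim + label_dim, hid_dim)   # stand-in recurrent cell
        self.scorer = nn.Linear(hid_dim, n_labels)

    def forward(self, word_id, prev_label_id, h_prev):
        x = torch.cat([self.word_emb(word_id), self.label_emb(prev_label_id)], dim=-1)
        h = self.cell(x, h_prev)
        return self.scorer(h), h                 # label scores for this word, new state
```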
...

References

SHOWING 1-10 OF 72 REFERENCES

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding

TLDR
The results show that on this task both types of recurrent networks outperform the CRF baseline substantially, and that a bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best, outperforming a CRF-based baseline by 14% in relative error reduction.

Context dependent recurrent neural network language model

TLDR
This paper improves recurrent neural network language model performance by providing, in association with each word, a contextual real-valued input vector that conveys information about the sentence being modeled; this vector is obtained by performing Latent Dirichlet Allocation on a block of preceding text.
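
A sketch of the context-vector idea under assumed shapes; a GRU stands in for the simple recurrent layer, the LDA topic inference is left outside the snippet, and all names and sizes are hypothetical.

```python
# Illustrative only: each word embedding is concatenated with a fixed-size topic
# vector (e.g. an LDA posterior over the preceding text block) at every time step.
import torch
import torch.nn as nn

class ContextRNNLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, topic_dim=40, hid_dim=400):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim + topic_dim, hid_dim, batch_first=True)
        self.proj = nn.Linear(hid_dim, vocab_size)

    def forward(self, word_ids, topic_vec):
        x = self.emb(word_ids)                                   # (B, T, emb_dim)
        ctx = topic_vec.unsqueeze(1).expand(-1, x.size(1), -1)   # (B, T, topic_dim)
        out, _ = self.rnn(torch.cat([x, ctx], dim=-1))
        return self.proj(out)                                    # next-word logits
```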

Recurrent neural networks for language understanding

TLDR
This paper modifies the architecture to perform language understanding and advances the state of the art on the widely used ATIS dataset.

Recurrent conditional random field for language understanding

TLDR
This paper shows that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model; specifically, the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding.
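
For readers unfamiliar with the transition features and global sequence-level objective mentioned above, a generic linear-chain CRF path score (not the paper's exact R-CRF formulation) looks like the following sketch.

```python
# Path score = sum of per-word emission scores (here assumed to come from the RNN)
# plus label-to-label transition scores. Training compares this score against all
# possible label paths; offline decoding picks the best path with Viterbi (not shown).
import torch

def path_score(emissions: torch.Tensor, transitions: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """emissions: (T, n_labels), transitions: (n_labels, n_labels), labels: (T,) gold ids."""
    score = emissions[0, labels[0]]
    for t in range(1, emissions.size(0)):
        score = score + transitions[labels[t - 1], labels[t]] + emissions[t, labels[t]]
    return score
```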

Spoken language understanding using long short-term memory neural networks

TLDR
This paper investigates the use of long short-term memory (LSTM) neural networks, which contain input, output, and forget gates and are more advanced than simple RNNs, for the word labeling task, and proposes a regression model on top of the un-normalized LSTM scores to explicitly model output-label dependence.

Generative and discriminative algorithms for spoken language understanding

TLDR
Generative and discriminative approaches to modeling sentence segmentation and concept labeling are studied, and it is shown how non-local, non-lexical features (e.g., a priori knowledge) can be modeled with the CRF, which is the best-performing algorithm across tasks.

Sentence simplification for spoken language understanding

TLDR
A dependency parsing-based sentence simplification approach is proposed that extracts a set of keywords from natural language sentences and uses them, in addition to the entire utterances, to complete SLU tasks.

Training Neural Network Language Models on Very Large Corpora

TLDR
New algorithms to train a neural network language model on very large text corpora are presented, making it possible to use the approach in domains where several hundred million words of text are available.

What is left to be understood in ATIS?

TLDR
It is concluded that, even with such low error rates, the ATIS test set still includes many unseen example categories and sequences and hence requires more data, and that new, larger annotated data sets from more complex tasks with realistic utterances can help avoid over-tuning in terms of modeling and feature design.

Speech recognition with deep recurrent neural networks

TLDR
This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long-range context that empowers RNNs.
...