Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

@article{Mesnil2015UsingRN,
  title={Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding},
  author={Gr{\'e}goire Mesnil and Yann Dauphin and Kaisheng Yao and Yoshua Bengio and Li Deng and Dilek Z. Hakkani-T{\"u}r and Xiaodong He and Larry Heck and Gokhan Tur and Dong Yu and Geoffrey Zweig},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2015},
  volume={23},
  pages={530-539}
}
Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly…
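The paper's networks were implemented in Theano; purely as an illustration of the Elman-type architecture it compares (a hidden state carried over from the previous word, with a softmax over slot labels at every position), here is a minimal PyTorch sketch. The class name, layer sizes, and vocabulary/label counts are assumptions, not the paper's configuration (num_slots=127 only roughly matches the ATIS label set).

# Minimal sketch of an Elman-type RNN slot tagger (illustrative; not the
# paper's Theano implementation). All hyperparameters are placeholders.
import torch
import torch.nn as nn

class ElmanSlotTagger(nn.Module):
    def __init__(self, vocab_size, num_slots, embed_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # nn.RNN implements the Elman recurrence: h_t = tanh(W x_t + U h_{t-1} + b)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_slots)

    def forward(self, word_ids):                 # word_ids: (batch, seq_len)
        x = self.embed(word_ids)                 # (batch, seq_len, embed_dim)
        h, _ = self.rnn(x)                       # hidden state at every word
        return self.out(h)                       # per-word slot-label scores

# Usage: one slot label is predicted per input word.
tagger = ElmanSlotTagger(vocab_size=10000, num_slots=127)
scores = tagger(torch.randint(0, 10000, (1, 12)))   # (1, 12, 127)
labels = scores.argmax(dim=-1)

A Jordan-type variant would feed back the previous output (label) distribution instead of the hidden state; a sketch of that label-feedback idea appears after the citation list below.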
Citations

Effective Spoken Language Labeling with Deep Recurrent Neural Networks
This paper designs new deep RNN architectures for spoken language understanding and achieves state-of-the-art results on two widely used SLU corpora: ATIS, in English, and MEDIA, in French.
Label-Dependency Coding in Simple Recurrent Networks for Spoken Language Understanding
It is shown that modeling label dependencies is useless on the (simple) ATIS database, where unstructured models can produce state-of-the-art results, and that the modification introduced in the proposed RNN outperforms both traditional RNNs and CRF models on the MEDIA benchmark.
Using Deep Time Delay Neural Network for Slot Filling in Spoken Language Understanding
This work proposes a multi-layer Time Delay Neural Network (TDNN), an architecture prevalent in current large-vocabulary continuous speech recognition, and shows that a TDNN with symmetric time delay offsets achieves state-of-the-art results without any additional knowledge or data resources.
ClockWork-RNN Based Architectures for Slot Filling
This paper proposes ClockWork recurrent neural network (CW-RNN) architectures for the slot-filling domain and finds that they significantly outperform simple RNNs and achieve results among the state of the art while retaining lower complexity.
Recurrent Neural Network Structured Output Prediction for Spoken Language Understanding
Recurrent Neural Networks (RNNs) have been widely used for sequence modeling due to their strong capabilities in modeling temporal dependencies. In this work, we study and evaluate the effectiveness…
Exploring the use of Attention-Based Recurrent Neural Networks For Spoken Language Understanding
First experiments carried out on the ATIS corpus confirm the quality of the RNN baseline system used in this paper and show that an RNN with an attention mechanism performs better than the RNN architectures recently proposed for the slot filling task.
Modeling with Recurrent Neural Networks for Open Vocabulary Slots
It is shown that the model outperforms existing RNN models at discovering ‘open-vocabulary’ slots without any external information, such as a named-entity database or knowledge base, and also demonstrates superior performance on intent detection.
A Comparison of Deep Learning Methods for Language Understanding
It is demonstrated that neural networks without feature engineering outperform state-of-the-art statistical and deep learning approaches on all three tasks (except written meal descriptions, where the CRF is slightly better), and that deep, attention-based BERT in particular surpasses state-of-the-art results on these tasks.
Joint Slot Filling and Intent Detection in Spoken Language Understanding by Hybrid CNN-LSTM Model
We investigate the use of hybrid convolutional and long short-term memory neural networks for joint slot filling and intent detection in spoken language understanding. We propose a novel model…
Label-Dependencies Aware Recurrent Neural Networks
This work proposes an evolution of the simple Jordan RNN in which predicted labels are re-injected as input to the network and converted into embeddings in the same way as words; the resulting model is not only more effective than other RNNs but also outperforms sophisticated CRF models.

References

Showing 1-10 of 81 references.
Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding
The results show that on this task both types of recurrent networks substantially outperform the CRF baseline, and that a bi-directional Jordan-type network that takes into account both past and future dependencies among slots works best, outperforming the CRF-based baseline by 14% in relative error reduction.
Context dependent recurrent neural network language model
This paper improves recurrent neural network language model performance by providing a contextual real-valued input vector in association with each word; the vector conveys contextual information about the sentence being modeled and is obtained by performing Latent Dirichlet Allocation on a block of preceding text.
Recurrent neural networks for language understanding
This paper modifies the architecture to perform language understanding and advances the state of the art on the widely used ATIS dataset.
Recurrent conditional random field for language understanding
This paper shows that the performance of an RNN tagger can be significantly improved by incorporating elements of the CRF model, specifically the explicit modeling of output-label dependencies with transition features, its global sequence-level objective function, and offline decoding (a brief sketch of this combination follows the reference list).
Spoken language understanding using long short-term memory neural networks
This paper investigates long short-term memory (LSTM) neural networks, which contain input, output, and forget gates and are more advanced than simple RNNs, for the word labeling task, and proposes a regression model on top of the LSTM's un-normalized scores to explicitly model output-label dependence.
Deep belief network based semantic taggers for spoken language understanding
It is observed that when discrete features are projected onto a continuous space during neural network training, the model learns to cluster these features, improving its generalization relative to a CRF model, especially in cases where some features are missing or noisy.
Generative and discriminative algorithms for spoken language understanding
Generative and discriminative approaches to modeling sentence segmentation and concept labeling are studied, and it is shown how non-local, non-lexical features (e.g., a priori knowledge) can be modeled with CRFs, the best-performing algorithm across tasks.
Sentence simplification for spoken language understanding
A dependency-parsing-based sentence simplification approach is proposed that extracts a set of keywords from natural language sentences and uses them, in addition to the entire utterances, for completing SLU tasks.
Training Neural Network Language Models on Very Large Corpora
New algorithms for training a neural network language model on very large text corpora are presented, making the approach feasible in domains where several hundred million words of text are available.
What is left to be understood in ATIS?
It is concluded that even with such low error rates, the ATIS test set still includes many unseen example categories and sequences and hence requires more data, and that new, larger annotated data sets from more complex tasks with realistic utterances could avoid over-tuning in modeling and feature design.
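The recurrent-CRF reference above combines per-word RNN scores with CRF-style transition features under a sequence-level objective. As a minimal, unbatched sketch of that kind of scoring (variable names, shapes, and the absence of start/stop transitions are all simplifying assumptions, not the paper's formulation):

# CRF-style sequence loss on top of per-word RNN emission scores.
import torch

def crf_nll(emissions, transitions, tags):
    # emissions: (seq_len, num_labels) un-normalized per-word RNN scores
    # transitions: (num_labels, num_labels); transitions[i, j] = score of label i -> j
    # tags: (seq_len,) gold label indices
    seq_len, num_labels = emissions.shape

    # Score of the gold path: emission scores plus transition scores.
    gold = emissions[0, tags[0]]
    for t in range(1, seq_len):
        gold = gold + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]

    # Partition function via the forward algorithm (log-sum-exp recursion).
    alpha = emissions[0]                                  # (num_labels,)
    for t in range(1, seq_len):
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    log_partition = torch.logsumexp(alpha, dim=0)

    return log_partition - gold   # negative log-likelihood of the gold tag sequence

The emissions here would come from a tagger such as the sketches earlier on this page, and decoding would use Viterbi search over the same emission and transition scores rather than a per-word argmax.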