Exploring Named Entity Recognition As an Auxiliary Task for Slot Filling in Conversational Language Understanding

Samuel Louvan and Bernardo Magnini
Slot filling is a crucial task in the Natural Language Understanding (NLU) component of a dialogue system. Most approaches for this task rely solely on domain-specific datasets for training. We propose a joint model of slot filling and Named Entity Recognition (NER) in a multi-task learning (MTL) setup. Our experiments on three slot filling datasets show that using NER as an auxiliary task improves slot filling performance and achieves competitive performance compared with state-of-the-art…

Auxiliary Capsules for Natural Language Understanding

This work extends the newly introduced application of Capsule Networks for NLU to a multi-task learning environment, using relevant auxiliary tasks, and performs joint intent classification and slot filling with the aid of Named Entity Recognition (NER) and Part-of-Speech (POS) tagging tasks.

AISFG: Abundant Information Slot Filling Generator

This work proposes the Abundant Information Slot Filling Generator (AISFG), a generative model with a novel query template that incorporates domain descriptions, slot descriptions, and examples with context; the model outperforms state-of-the-art approaches on zero/few-shot slot filling tasks.

A Two-stage Model for Slot Filling in Low-resource Settings: Domain-agnostic Non-slot Reduction and Pretrained Contextual Embeddings

This paper proposes a novel two-stage model architecture that can be trained with only a few in-domain hand-labeled examples and that outperforms other state-of-the-art systems on the SNIPS benchmark dataset.

On the Use of External Data for Spoken Named Entity Recognition

This work considers self-training, knowledge distillation, and transfer learning for end-to-end (E2E) and pipeline (speech recognition followed by text NER) approaches and finds that several of these approaches improve performance in resource-constrained settings beyond the benefits from pre-trained representations.

Clustering-Based Sequence to Sequence Model for Generative Question Answering in a Low-Resource Language

This paper presents a framework that exploits the semantic clusters among question-answer pairs to compensate for the lack of training data, and that outperforms the standard sequence-to-sequence model by a large margin in terms of ROUGE and BLEU scores.

FASTDial: Abstracting Dialogue Policies for Fast Development of Task Oriented Agents

A novel abstraction framework called FASTDial for designing task-oriented dialogue agents, built on top of the OpenDial toolkit, that minimizes programming effort and domain expert training time by hiding many implementation details.

Polarity enriched attention network for aspect-based sentiment analysis

Methods to enhance sentiment granularity at the aspect level using the proposed PEAN model, which substantially improves ABSA performance, as illustrated on two benchmark sentiment datasets.

To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

This work investigates how to use unlabeled data effectively by exploring the task-specific semi-supervised approach, Cross-View Training (CVT), and comparing it with task-agnostic BERT in multiple settings that include domain- and task-relevant English data.

Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning with Spoofing Detection and Spoofing Type Classification

A multi-task learning (MTL) method to improve the performance of a MOS prediction model using the following two auxiliary tasks: spoofing detection (SD) and spoofing type classification (STC).

A Survey of Joint Intent Detection and Slot Filling Models in Natural Language Understanding

This survey brings the coverage of methods up to 2021, including the many applications of deep learning in the field, and looks at the issues addressed in the joint task and the approaches designed to address them.



Bag of Experts Architectures for Model Reuse in Conversational Language Understanding

This work describes Bag of Experts (BoE) architectures for model reuse for both LSTM- and CRF-based models for slot tagging and shows that these models outperform the baseline models with a statistically significant average margin of 5.06% in absolute F1-score.

Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM

Experimental results show the power of a holistic multi-domain, multi-task modeling approach to estimate complete semantic frames for all user utterances addressed to a conversational system over alternative methods based on single domain/task deep learning.

A Bi-Model Based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling

New Bi-model based RNN semantic frame parsing network structures are designed to perform the intent detection and slot filling tasks jointly, by considering their cross-impact on each other using two correlated bidirectional LSTMs (BLSTM).

Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding

The proposed multi-task model delivers better performance with less data by leveraging patterns that it learns from the other tasks, and supports an open vocabulary, which allows the models to generalize to unseen words.

Slot-Gated Modeling for Joint Slot Filling and Intent Prediction

This work proposes a slot gate that focuses on learning the relationship between intent and slot attention vectors in order to obtain better semantic frame results through global optimization.

New Transfer Learning Techniques for Disparate Label Sets

This work proposes a solution based on label embeddings induced from canonical correlation analysis (CCA) that reduces the problem to a standard domain adaptation task and allows use of a number of transfer learning techniques.

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

A novel neural network architecture is introduced that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF, thus making it applicable to a wide range of sequence labeling tasks.

What is left to be understood in ATIS?

It is concluded that even with such low error rates, the ATIS test set still includes many unseen example categories and sequences and hence requires more data, and that new, larger annotated data sets from more complex tasks with realistic utterances can avoid over-tuning in terms of modeling and feature design.

Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling

This work proposes an attention-based neural network model for joint intent detection and slot filling, both of which are critical steps for many speech understanding and dialog systems.

Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding

It is shown that adversarial training helps in learning better domain-general SLU models, leading to improved slot filling F1 scores, and that applying adversarial learning to the domain-general model also helps in achieving higher slot filling performance when the model is jointly optimized with domain-specific models.