Corpus ID: 19232497

Empower Sequence Labeling with Task-Aware Neural Language Model

@inproceedings{Liu2018EmpowerSL,
  title={Empower Sequence Labeling with Task-Aware Neural Language Model},
  author={Liyuan Liu and Jingbo Shang and Frank F. Xu and Xiang Ren and Huan Gui and Jian Peng and Jiawei Han},
  booktitle={AAAI},
  year={2018}
}
Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. … Besides the word-level knowledge contained in pre-trained word embeddings, character-aware neural language models are incorporated to extract character-level knowledge. Transfer learning techniques are further adopted to mediate the different components and guide the language model towards the key knowledge. Compared to previous methods, these…
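The abstract describes coupling a character-aware neural language model with a word-level sequence labeler trained end-to-end. The following is a minimal, illustrative PyTorch sketch of that idea, not the authors' implementation; the module names, dimensions, and the omission of the CRF layer are assumptions made for brevity.

import torch
import torch.nn as nn

class CharLMSequenceLabeler(nn.Module):
    """Sketch: a character-level LM co-trained with a word-level sequence labeler."""

    def __init__(self, n_chars, n_words, n_tags, char_dim=30, word_dim=100, hidden=200):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Character-level LSTM shared by the tagger and the language-model head.
        self.char_lstm = nn.LSTM(char_dim, hidden, batch_first=True)
        self.lm_head = nn.Linear(hidden, n_words)        # auxiliary next-word prediction
        self.word_emb = nn.Embedding(n_words, word_dim)  # pre-trained embeddings in practice
        self.word_lstm = nn.LSTM(word_dim + hidden, hidden,
                                 batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * hidden, n_tags)      # CRF layer omitted for brevity

    def forward(self, char_ids, word_ids, word_ends):
        # char_ids: (batch, n_char_positions); word_ids, word_ends: (batch, seq_len),
        # where word_ends holds the index of each word's last character.
        char_states, _ = self.char_lstm(self.char_emb(char_ids))
        idx = word_ends.unsqueeze(-1).expand(-1, -1, char_states.size(-1))
        char_word_repr = char_states.gather(1, idx)      # char-level word representations
        lm_logits = self.lm_head(char_word_repr)         # language-model objective
        word_states, _ = self.word_lstm(
            torch.cat([self.word_emb(word_ids), char_word_repr], dim=-1))
        return self.tagger(word_states), lm_logits

Training such a model would minimize the labeling loss plus a weighted language-model loss computed on the same raw text, which is one way to realize the end-to-end integration of task-specific knowledge the abstract refers to.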

Citations

Improving Neural Sequence Labelling Using Additional Linguistic Information
TLDR
This study proposes a method for adding various linguistic features to a neural sequence labelling framework to improve sequence labelling, and achieves state-of-the-art results on benchmark POS tagging, NER, and chunking datasets.
A Survey on Recent Advances in Sequence Labeling from Deep Learning Models
TLDR
This paper presents a comprehensive review of existing deep learning-based sequence labeling models, covering three related tasks, i.e., part-of-speech tagging, named entity recognition, and text chunking, and systematically presents the existing approaches based on a scientific taxonomy.
Learning Named Entity Tagger using Domain-Specific Dictionary
TLDR
After identifying the nature of noisy labels in distant supervision, a novel, more effective neural model, AutoNER, is proposed with a new Tie or Break scheme, and ways to refine distant supervision for better NER performance are discussed.
Enhancing Neural Sequence Labeling with Position-Aware Self-Attention
TLDR
An attention-based model, position-aware self-attention (PSA), is proposed within a neural network architecture to exploit the positional information of an input sequence and capture the latent relations among tokens.
Domain-aware Neural Model for Sequence Labeling using Joint Learning
TLDR
A joint-learning neural network is proposed that encapsulates global domain knowledge and local sentence/token information to enhance the sequence labeling model, and it outperforms classical and recent state-of-the-art labeling methods.
Semi-Supervised Sequence Modeling with Cross-View Training
TLDR
Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data, is proposed and evaluated, achieving state-of-the-art results.
Learning Better Internal Structure of Words for Sequence Labeling
TLDR
This work proposes IntNet, a funnel-shaped wide convolutional neural architecture with no down-sampling, for learning representations of the internal structure of words by composing their characters from limited supervised training corpora.
Combining neural and knowledge-based approaches to Named Entity Recognition in Polish
TLDR
This work shows that combining a neural NER model and an entity linking model backed by a knowledge base is more effective at recognizing named entities than using the NER model alone.
Gated Task Interaction Framework for Multi-task Sequence Tagging
TLDR
This paper presents an approach to jointly learning linguistic features along with the target sequence labelling tasks, using a new multi-task learning (MTL) framework, the Gated Task Interaction (GTI) network, for solving multiple sequence tagging tasks.

References

Showing 1-10 of 39 references
Semi-supervised sequence tagging with bidirectional language models
TLDR
A general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems is presented and applied to sequence labeling tasks, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task-specific gazetteers.
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
TLDR
A novel neural network architecture is introduced that automatically benefits from both word- and character-level representations by using a combination of bidirectional LSTM, CNN, and CRF, making it applicable to a wide range of sequence labeling tasks.
Named Entity Recognition with Bidirectional LSTM-CNNs
TLDR
A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.
Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks
TLDR
The effects of transfer learning for deep hierarchical recurrent networks across domains, applications, and languages are examined, and it is shown that significant improvement can often be obtained.
Semi-supervised Multitask Learning for Sequence Labeling
TLDR
A sequence labeling framework with a secondary training objective, learning to predict surrounding words for every word in the dataset, which incentivises the system to learn general-purpose patterns of semantic and syntactic composition, useful for improving accuracy on different sequence labeling tasks.
Natural Language Processing (Almost) from Scratch
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach
TLDR
This work proposes a novel framework, REHession, to conduct relation extractor learning using annotations from heterogeneous information sources, e.g., knowledge bases and domain heuristics, and adopts embedding techniques to learn distributed representations of context.
VecShare: A Framework for Sharing Word Representation Vectors
TLDR
This work presents a framework, called VecShare, that makes it easy to share and retrieve word embeddings on the Web, performs an experimental evaluation of VecShare's similarity strategies, and shows that they are effective at efficiently retrieving embeddings that boost accuracy in a document classification task.
Early results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons
TLDR
This work has shown that conditionally-trained models, such as conditional maximum entropy models, handle inter-dependent features of greedy sequence modeling in NLP well.
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
TLDR
A joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks and uses a simple regularization term to allow for optimizing all model weights to improve one task’s loss without exhibiting catastrophic interference of the other tasks.