End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

@article{Ma2016EndtoendSL,
  title={End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF},
  author={Xuezhe Ma and Eduard H. Hovy},
  journal={ArXiv},
  year={2016},
  volume={abs/1603.01354}
}
State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. [...] We obtain state-of-the-art performance on both datasets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.

A Two-Stage Deep Neural Network for Sequence Labeling

A two-stage deep neural network architecture is proposed for sequence labeling, which enables the higher layer to use the coarse-grained labeling information from the lower level of the model.

Bidirectional LSTM-CNNs-CRF Models for POS Tagging

A hybrid neural network architecture combining discriminative word embeddings, character embeddings, and byte pair encoding (BPE) is presented, implementing a true end-to-end system for part-of-speech (POS) tagging without feature engineering or data pre-processing.

Learning Task-specific Representation for Novel Words in Sequence Labeling

This work proposes a novel method to predict representations for OOV words from their surface forms and contexts, and shows that the proposed method achieves better or competitive performance on the OOV problem compared with existing state-of-the-art methods.

Neural Joint Model for Part-of-Speech Tagging and Entity Extraction

A neural joint model is proposed, based on a bidirectional long short-term memory (BiLSTM) network and adversarial transfer learning, that incorporates syntactic information from two tasks by using task-shared information.

A Survey on Recent Advances in Sequence Labeling from Deep Learning Models

This paper presents a comprehensive review of existing deep learning-based sequence labeling models, covering three related tasks (part-of-speech tagging, named entity recognition, and text chunking), and systematically presents the existing approaches based on a scientific taxonomy.

Bidirectional LSTM-CRF for Named Entity Recognition

This work is the first to experiment with BI-CRF in neural architectures for sequence labeling tasks, and it shows that a CRF can be extended to capture the dependencies between labels in both the right and left directions of the sequence.

Learning Context Using Segment-Level LSTM for Neural Sequence Labeling

The proposed model enhances the performance of tasks for finding appropriate labels of multiple token segments by employing an additional segment-level long short-term memory (LSTM) that trains features by learning adjacent context in a segment.

Improved Named Entity Recognition for Noisy Call Center Transcripts

This work proposes a set of models that use state-of-the-art Transformer language models (RoBERTa) to develop a high-accuracy NER system trained on a custom annotated set of call center transcripts, and proposes a new general annotation scheme for NER in the call center environment.

Empower Sequence Labeling with Task-Aware Neural Language Model

This study develops a neural framework to extract knowledge from raw texts and empower the sequence labeling task, and leverages character-level knowledge from self-contained order information of training sequences.
...

References

SHOWING 1-10 OF 64 REFERENCES

Boosting Named Entity Recognition with Neural Character Embeddings

This work proposes a language-independent NER system that uses automatically learned features only and demonstrates that the same neural network which has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and without any handcrafted features.

Named Entity Recognition with Bidirectional LSTM-CNNs

A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.

Learning Character-level Representations for Part-of-Speech Tagging

A deep neural network is proposed that learns character-level representations of words and associates them with the usual word representations to perform POS tagging, producing state-of-the-art POS taggers for two languages.

Bidirectional LSTM-CRF Models for Sequence Tagging

This work is the first to apply a bidirectional LSTM-CRF model to NLP benchmark sequence tagging data sets, and it is shown that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component.

Lexicon Infused Phrase Embeddings for Named Entity Resolution

A new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both CoNLL and OntoNotes NER, are presented.

Multi-Task Cross-Lingual Sequence Tagging from Scratch

A deep hierarchical recurrent neural network for sequence tagging is presented that employs deep gated recurrent units on both character and word levels to encode morphology and context information, and applies a conditional random field layer to predict the tags.

Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning

It is shown that new state-of-the-art word segmentation systems use neural models to learn representations for predicting word boundaries, and these same representations, jointly trained with an NER system, yield significant improvements in NER for Chinese social media.

Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network

A new part-of-speech tagger is presented that demonstrates the following ideas: explicit use of both preceding and following tag contexts via a dependency network representation, broad use of lexical features, and effective use of priors in conditional log-linear models.

Non-lexical neural architecture for fine-grained POS Tagging

Experimental results show that the convolutional network can infer meaningful word representations, while for the prediction stage, a well-designed and structured strategy allows the model to outperform state-of-the-art results without any feature engineering.

Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings

A new corpus of Weibo messages annotated for both name and nominal mentions is presented and a joint training objective for the embeddings that makes use of both (NER) labeled and unlabeled raw text is proposed.
...