Corpus ID: 37038698

Effective Word Representation for Named Entity Recognition

@inproceedings{Hsieh2017EffectiveWR,
  title={Effective Word Representation for Named Entity Recognition},
  author={Jun-Ting Hsieh},
  year={2017}
}
Recently, various machine learning models have been built using word-level embeddings and have achieved substantial improvements in NER prediction accuracy. However, most NER models take only words as input and ignore character-level information. In this paper, we propose an effective word representation that efficiently combines word-level and character-level information by averaging a word's character n-gram embeddings. Our best-performing model uses a bidirectional LSTM with word and character n-gram embeddings.
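
A minimal sketch of the character n-gram averaging idea described above, assuming hypothetical embedding lookup tables (word_emb, ngram_emb) and illustrative n-gram sizes and dimensions; the paper's actual hyperparameters and the exact way the word and character parts are combined are not given in this excerpt, so the concatenation shown here is an assumption.

import numpy as np

def char_ngrams(word, n_sizes=(2, 3)):
    # Enumerate character n-grams, padding the word with boundary markers.
    padded = "<" + word + ">"
    grams = []
    for n in n_sizes:
        grams.extend(padded[i:i + n] for i in range(len(padded) - n + 1))
    return grams

def word_representation(word, word_emb, ngram_emb, dim=50):
    # Average the embeddings of the word's character n-grams ...
    vecs = [ngram_emb[g] for g in char_ngrams(word) if g in ngram_emb]
    char_part = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    # ... and pair the result with the word-level embedding.
    return np.concatenate([word_emb.get(word, np.zeros(dim)), char_part])

The resulting per-token vectors would then be fed to a bidirectional LSTM tagger, as the abstract describes.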


References

Showing 1-10 of 16 references

Lexicon Infused Phrase Embeddings for Named Entity Resolution

TLDR
Presents a new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named entity recognition in both the CoNLL and OntoNotes NER tasks.

Boosting Named Entity Recognition with Neural Character Embeddings

TLDR
This work proposes a language-independent NER system that uses only automatically learned features, and demonstrates that the same neural network that has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and without any handcrafted features.

Named Entity Recognition with Bidirectional LSTM-CNNs

TLDR
A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.

Character-Aware Neural Language Models

TLDR
A simple neural language model that relies only on character-level inputs and is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.

GloVe: Global Vectors for Word Representation

TLDR
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
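
For context, the weighted least-squares objective from the GloVe paper, where X_{ij} is the co-occurrence count of words i and j, w_i and \tilde{w}_j are word and context vectors with biases b_i and \tilde{b}_j, and f is a weighting function that down-weights rare co-occurrences:

J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2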

Multilingual Language Processing From Bytes

TLDR
Describes an LSTM-based model that reads text as bytes and outputs span annotations of the form [start, length, label], where start positions, lengths, and labels are separate entries in the model's vocabulary.

Distributed Representations of Sentences and Documents

TLDR
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.

Distributed Representations of Words and Phrases and their Compositionality

TLDR
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
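
For context, the negative-sampling objective from that paper replaces the full softmax with k sampled negative words per observed (input, output) pair (w_I, w_O), drawn from a noise distribution P_n(w):

\log \sigma\!\left( {v'}_{w_O}^{\top} v_{w_I} \right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma\!\left( -{v'}_{w_i}^{\top} v_{w_I} \right) \right]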

Character-level Convolutional Networks for Text Classification

TLDR
This work constructs several large-scale datasets to show that character-level convolutional networks can achieve state-of-the-art or competitive results in text classification.

Factored Neural Language Models

TLDR
A new type of neural probabilistic language model is presented that learns a mapping from both words and explicit word features into a continuous space, which is then used for word prediction and significantly reduces perplexity on sparse-data tasks.