
Effective Word Representation for Named Entity Recognition

Jun-Ting Hsieh

Recently, various machine learning models built on word-level embeddings have achieved substantial improvements in NER prediction accuracy. Most NER models, however, take only words as input and ignore character-level information. In this paper, we propose an effective word representation that efficiently captures both word-level and character-level information by averaging a word's character n-gram embeddings. Our best performing model uses a bidirectional LSTM with word and character n-gram embeddings.
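
The core construction is simple enough to sketch. Below is a minimal NumPy illustration under assumptions of my own: the embedding size, n-gram orders, and hashing of n-grams into a fixed-size table are all illustrative, and the bidirectional LSTM that consumes these representations in the full model is omitted.

```python
# Sketch of the proposed representation: a word embedding concatenated with
# the average of the word's character n-gram embeddings. Dimensions, n-gram
# orders, and the hashing trick are assumptions, not the paper's settings.
import numpy as np

EMB_DIM = 50          # assumed size for both word and n-gram embeddings
NGRAM_SIZES = (2, 3)  # assumed n-gram orders
N_BUCKETS = 10_000    # n-grams are hashed into a fixed-size table

rng = np.random.default_rng(0)
word_emb = {}  # word -> vector, lazily initialized for the example
ngram_emb = rng.normal(0.0, 0.1, (N_BUCKETS, EMB_DIM))

def char_ngrams(word, sizes=NGRAM_SIZES):
    """Enumerate character n-grams of the word padded with boundary marks."""
    padded = f"<{word}>"
    return [padded[i:i + n] for n in sizes for i in range(len(padded) - n + 1)]

def word_representation(word):
    """Concatenate the word vector with the mean of its n-gram vectors."""
    if word not in word_emb:
        word_emb[word] = rng.normal(0.0, 0.1, EMB_DIM)
    rows = [hash(g) % N_BUCKETS for g in char_ngrams(word)]
    return np.concatenate([word_emb[word], ngram_emb[rows].mean(axis=0)])

print(word_representation("recognition").shape)  # (100,)
```

Averaging, rather than concatenating, the n-gram vectors keeps the representation a fixed length regardless of word length, which is what makes it cheap to plug into an existing word-level model.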

References

Lexicon Infused Phrase Embeddings for Named Entity Resolution

A new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations is presented, along with the first system to use neural word embeddings to achieve state-of-the-art results on named entity recognition on both the CoNLL and OntoNotes NER benchmarks.

Boosting Named Entity Recognition with Neural Character Embeddings

This work proposes a language-independent NER system that uses automatically learned features only, and demonstrates that the same neural network that has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and without any handcrafted features.

Named Entity Recognition with Bidirectional LSTM-CNNs

A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid of a bidirectional LSTM and a CNN, eliminating the need for most feature engineering.
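
As an illustration of the character-level half of such a hybrid, here is a hedged NumPy sketch of a character CNN with max-over-time pooling; the filter width, dimensions, and ASCII character table are assumptions for the example, not the paper's settings. In the full architecture the pooled features would be concatenated with a word embedding and fed to the bidirectional LSTM.

```python
# Character CNN feature extractor: convolve filters over a word's character
# embeddings and max-pool over time. All sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
CHAR_DIM, N_FILTERS, WIDTH = 16, 30, 3
char_emb = rng.normal(0.0, 0.1, (128, CHAR_DIM))        # one row per ASCII code
filters = rng.normal(0.0, 0.1, (N_FILTERS, WIDTH, CHAR_DIM))

def char_cnn_features(word):
    """Max-over-time pooling of width-WIDTH convolutions over char embeddings."""
    chars = char_emb[[min(ord(c), 127) for c in word]]  # (len, CHAR_DIM)
    if len(chars) < WIDTH:                              # zero-pad short words
        chars = np.vstack([chars, np.zeros((WIDTH - len(chars), CHAR_DIM))])
    windows = np.stack([chars[i:i + WIDTH] for i in range(len(chars) - WIDTH + 1)])
    conv = np.einsum("twd,fwd->tf", windows, filters)   # (time, N_FILTERS)
    return conv.max(axis=0)                             # one value per filter

print(char_cnn_features("Obama").shape)  # (30,)
```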

Character-Aware Neural Language Models

A simple neural language model that relies only on character-level inputs is shown to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.

GloVe: Global Vectors for Word Representation

A new global log-bilinear regression model is presented that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
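
For reference, the objective GloVe minimizes is a weighted least-squares fit to the log co-occurrence counts X_ij:

$$ J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2 $$

where w_i and \tilde{w}_j are word and context vectors, b_i and \tilde{b}_j are biases, and f is a weighting function that downweights rare co-occurrences and caps very frequent ones.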

Multilingual Language Processing From Bytes

An LSTM-based model is described that reads text as bytes and outputs span annotations of the form [start, length, label], where the start positions, lengths, and labels are separate entries in the model's vocabulary.
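
To make the output format concrete, here is a tiny Python example of decoding such [start, length, label] spans against UTF-8 bytes; the sentence and spans are made up for illustration:

```python
# Decode [start, length, label] byte spans back into entity strings.
text = "Barack Obama visited Paris.".encode("utf-8")
spans = [(0, 12, "PER"), (21, 5, "LOC")]  # illustrative model output
for start, length, label in spans:
    print(label, text[start:start + length].decode("utf-8"))
# PER Barack Obama
# LOC Paris
```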

Distributed Representations of Sentences and Documents

Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents; its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
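
As a usage sketch (the tooling is my assumption, not something this summary prescribes), gensim ships an implementation of Paragraph Vector as Doc2Vec:

```python
# Hedged usage sketch: gensim's Doc2Vec implements Paragraph Vector.
# The corpus, vector size, and epoch count are toy values.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    TaggedDocument(words=["neural", "models", "learn", "representations"], tags=[0]),
    TaggedDocument(words=["bag", "of", "words", "ignores", "word", "order"], tags=[1]),
]
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)
vec = model.infer_vector(["learn", "word", "order"])  # fixed-length vector
print(vec.shape)  # (50,)
```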

Distributed Representations of Words and Phrases and their Compositionality

This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
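
The negative sampling objective from the paper replaces the full softmax with k sampled contrasts: for an input word w_I and observed output word w_O, it maximizes

$$ \log \sigma\left( {v'_{w_O}}^{\top} v_{w_I} \right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma\left( -{v'_{w_i}}^{\top} v_{w_I} \right) \right] $$

where \sigma is the logistic function and P_n(w) is the noise distribution (the unigram distribution raised to the 3/4 power worked best in the paper).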

Character-level Convolutional Networks for Text Classification

This article constructs several large-scale datasets to show that character-level convolutional networks can achieve state-of-the-art or competitive results on text classification.

Factored Neural Language Models

A new type of neural probabilistic language model is presented that learns a mapping from both words and explicit word features into a continuous space that is then used for word prediction, significantly reducing perplexity on sparse-data tasks.
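
A hedged sketch of the input side of such a factored model: each word's embedding is concatenated with embeddings of its explicit features before prediction. The feature choices (capitalization and a 3-character suffix), names, and sizes below are illustrative assumptions, not the paper's factors.

```python
# Factored input representation: word embedding + explicit feature embeddings.
import numpy as np

rng = np.random.default_rng(2)
DIM = 32
word_table, feat_table = {}, {}  # lazily initialized embedding tables

def embed(table, key):
    if key not in table:
        table[key] = rng.normal(0.0, 0.1, DIM)
    return table[key]

def factored_input(word):
    """Concatenate the word vector with vectors for its explicit features."""
    feats = [("cap", word[0].isupper()), ("suffix", word[-3:].lower())]
    vecs = [embed(word_table, word.lower())] + [embed(feat_table, f) for f in feats]
    return np.concatenate(vecs)

print(factored_input("Paris").shape)  # (96,)
```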