Corpus ID: 37038698

Effective Word Representation for Named Entity Recognition

@inproceedings{Hsieh2017EffectiveWR,
  title={Effective Word Representation for Named Entity Recognition},
  author={Jun-Ting Hsieh},
  year={2017}
}
Recently, various machine learning models built on word-level embeddings have achieved substantial improvements in NER prediction accuracy. Most NER models take only words as input and ignore character-level information. In this paper, we propose an effective word representation that efficiently includes both word-level and character-level information by averaging a word's character n-gram embeddings. Our best-performing model uses a bidirectional LSTM with word and character n…
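The abstract's core idea can be illustrated with a minimal sketch (not the authors' released code): a word's representation is built by averaging the embeddings of its character n-grams and combining the result with a word-level embedding. The embedding dimensions, the n-gram size, and the boundary markers below are illustrative assumptions.

```python
# Minimal sketch of the word representation described in the abstract:
# average the word's character n-gram embeddings and combine the result
# with a word-level embedding. Dimensions and n-gram size are assumptions.
import numpy as np

rng = np.random.default_rng(0)
WORD_DIM, CHAR_DIM, N = 100, 25, 3      # assumed embedding sizes and n-gram length

word_emb = {}                           # word-level lookup table
ngram_emb = {}                          # character n-gram lookup table

def lookup(table, key, dim):
    """Return (and lazily initialise) an embedding vector for `key`."""
    if key not in table:
        table[key] = rng.normal(scale=0.1, size=dim)
    return table[key]

def char_ngrams(word, n=N):
    padded = f"<{word}>"                # boundary markers, a common convention
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def word_representation(word):
    """Concatenate the word embedding with the mean of its n-gram embeddings."""
    grams = char_ngrams(word)
    char_avg = np.mean([lookup(ngram_emb, g, CHAR_DIM) for g in grams], axis=0)
    return np.concatenate([lookup(word_emb, word.lower(), WORD_DIM), char_avg])

print(word_representation("Obama").shape)   # (125,)
```

In the full model, the sequence of such vectors for a sentence would presumably be fed to the bidirectional LSTM tagger described in the abstract.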

References

SHOWING 1-10 OF 16 REFERENCES
Lexicon Infused Phrase Embeddings for Named Entity Resolution
TLDR
A new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both the CoNLL and OntoNotes NER benchmarks, are presented.
Boosting Named Entity Recognition with Neural Character Embeddings
TLDR
This work proposes a language-independent NER system that uses only automatically learned features and demonstrates that the same neural network which has been successfully applied to POS tagging can also achieve state-of-the-art results for language-independent NER, using the same hyperparameters and without any handcrafted features.
Named Entity Recognition with Bidirectional LSTM-CNNs
TLDR
A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.
GloVe: Global Vectors for Word Representation
TLDR
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
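For context on the GloVe entry above, here is a minimal sketch of its weighted least-squares objective for a single word/context co-occurrence count X_ij; the x_max and alpha defaults follow the values reported in the GloVe paper, and the vectors in the usage line are random placeholders.

```python
# Sketch of the GloVe per-pair loss:
#   f(X_ij) * (w_i . w_j~ + b_i + b_j~ - log X_ij)^2
import numpy as np

def glove_pair_loss(w_i, w_j, b_i, b_j, x_ij, x_max=100.0, alpha=0.75):
    """Weighted squared error for one word/context pair."""
    weight = min((x_ij / x_max) ** alpha, 1.0)   # f(X_ij): down-weights rare, caps frequent pairs
    error = w_i @ w_j + b_i + b_j - np.log(x_ij)
    return weight * error ** 2

rng = np.random.default_rng(0)
print(glove_pair_loss(rng.normal(size=50), rng.normal(size=50), 0.0, 0.0, x_ij=12.0))
```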
Multilingual Language Processing From Bytes
TLDR
This paper describes an LSTM-based model that reads text as bytes and outputs span annotations of the form [start, length, label], where start positions, lengths, and labels are separate entries in the model's vocabulary.
Distributed Representations of Sentences and Documents
TLDR
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Distributed Representations of Words and Phrases and their Compositionality
TLDR
This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
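As a brief illustration of the negative-sampling alternative mentioned in the entry above, the sketch below computes the skip-gram negative-sampling loss for one (centre word, context word) pair; the vector dimension and number of negative samples are illustrative assumptions.

```python
# Sketch of the skip-gram negative-sampling loss for one training pair:
#   -log sigma(u_o . v_c) - sum_k log sigma(-u_k . v_c)
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(v_center, u_context, u_negatives):
    """Loss for a centre vector, its true context vector, and k sampled negatives."""
    pos = np.log(sigmoid(u_context @ v_center))          # pull true pair together
    neg = np.sum(np.log(sigmoid(-u_negatives @ v_center)))  # push sampled negatives apart
    return -(pos + neg)

rng = np.random.default_rng(0)
d, k = 100, 5                                            # assumed dimension and negatives
print(neg_sampling_loss(rng.normal(size=d),
                        rng.normal(size=d),
                        rng.normal(size=(k, d))))
```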
Character-level Convolutional Networks for Text Classification
TLDR
This article constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results in text classification.
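To make the character-level input concrete, the following sketch (an illustration, not the paper's architecture) one-hot encodes text over a fixed alphabet and applies a plain 1-D convolution; the alphabet, sequence length, filter width, and filter count are assumptions.

```python
# Sketch of character-level input for a convolutional text classifier:
# one-hot encode characters over a fixed alphabet, then apply a 1-D
# convolution with ReLU over the character axis.
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "
CHAR_IDX = {c: i for i, c in enumerate(ALPHABET)}

def one_hot(text, max_len=64):
    """Encode `text` as a (max_len, |alphabet|) one-hot matrix; unknowns stay zero."""
    mat = np.zeros((max_len, len(ALPHABET)))
    for pos, ch in enumerate(text.lower()[:max_len]):
        if ch in CHAR_IDX:
            mat[pos, CHAR_IDX[ch]] = 1.0
    return mat

def conv1d(x, filters, width=7):
    """Valid 1-D convolution over the character axis followed by ReLU."""
    out = np.stack([
        np.array([np.sum(x[i:i + width] * f) for i in range(x.shape[0] - width + 1)])
        for f in filters
    ])
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
filters = rng.normal(scale=0.1, size=(8, 7, len(ALPHABET)))  # 8 filters of width 7
features = conv1d(one_hot("character level networks"), filters)
print(features.shape)  # (8, 58)
```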
Factored Neural Language Models
TLDR
A new type of neural probabilistic language model is presented that learns a mapping from both words and explicit word features into a continuous space that is then used for word prediction and significantly reduces perplexity on sparse-data tasks.
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
TLDR
A joint many-task model is presented together with a strategy for successively growing its depth to solve increasingly complex tasks; it uses a simple regularization term that allows all model weights to be optimized to improve one task's loss without catastrophic interference with the other tasks.