Boosting Named Entity Recognition with Neural Character Embeddings

Cícero Nogueira dos Santos and Victor Guimarães
Most state-of-the-art named entity recognition (NER) systems rely on handcrafted features and on the output of other NLP tasks such as part-of-speech (POS) tagging and text chunking. In this work we propose a language-independent NER system that uses automatically learned features only. Our approach is based on the CharWNN deep neural network, which uses word-level and character-level representations (embeddings) to perform sequential classification. We perform an extensive number of… 
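The character-level convolution at the core of CharWNN can be sketched as follows. This is a minimal illustration with untrained random parameters; the sizes (5-dim character embeddings, 10 filters, width-3 windows) are illustrative assumptions, not the paper's hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy, untrained parameters; in CharWNN the character embeddings and the
# convolution filters are learned jointly with the tagger.
CHARS = "abcdefghijklmnopqrstuvwxyz"
char_emb = rng.normal(size=(len(CHARS), 5))   # one 5-dim vector per character
W = rng.normal(size=(10, 3 * 5))              # 10 filters over 3-char windows
b = np.zeros(10)

def char_level_word_vector(word):
    """Convolve over character windows and max-pool over positions, so
    words of any length map to a vector of the same dimensionality."""
    embs = [char_emb[CHARS.index(c)] for c in word.lower() if c in CHARS]
    while len(embs) < 3:                      # pad very short words
        embs.append(np.zeros(5))
    windows = np.stack([np.concatenate(embs[i:i + 3])
                        for i in range(len(embs) - 2)])
    conv = np.tanh(windows @ W.T + b)         # shape: (n_windows, 10)
    return conv.max(axis=0)                   # max-pool -> shape (10,)
```

In the CharWNN architecture this character-level vector is concatenated with a conventional word embedding before sequential classification, which is how the model captures morphological cues without handcrafted features.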
Assessing the Impact of Contextual Embeddings for Portuguese Named Entity Recognition

The best NER system outperforms the previous state of the art in Portuguese NER by 5.99 absolute percentage points, and a comparative study of 16 combinations of shallow and contextual embeddings is presented.

Named Entity Recognition for Spoken Finnish

A bidirectional LSTM neural network with a Conditional Random Field layer on top is presented; it uses word, character, and morph embeddings to perform named entity recognition on several Finnish datasets.
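At prediction time, the CRF layer used in taggers like this one scores whole tag sequences (per-token emission scores plus tag-to-tag transition scores) and is decoded with the Viterbi algorithm. A minimal, library-free sketch; the two-tag example and all scores below are made up for illustration:

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence given per-token emission
    scores (n_tokens x n_tags) and tag-to-tag transition scores
    (n_tags x n_tags), as in the output layer of a BiLSTM-CRF."""
    n_tokens, n_tags = len(emissions), len(emissions[0])
    score = list(emissions[0])            # best score ending in each tag
    backpointers = []
    for t in range(1, n_tokens):
        new_score, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backpointers.append(ptrs)
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backpointers):   # walk the backpointers
        path.append(ptrs[path[-1]])
    return path[::-1]

# Two toy tags: 0 = O (outside), 1 = ENT. A strong penalty on leaving ENT
# makes the decoder keep the entity open even where emissions prefer O.
emissions = [[0, 5], [1, 0], [1, 0]]
transitions = [[0, 0], [-10, 0]]          # transitions[1][0] penalizes ENT->O
print(viterbi_decode(emissions, transitions))   # [1, 1, 1]
```

With zero transition scores the same input decodes to the per-token argmax [1, 0, 0]; the transition matrix is what lets the CRF enforce coherent label sequences.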

Character-based Bidirectional LSTM-CRF with words and characters for Japanese Named Entity Recognition

A neural model is proposed that predicts a tag for each character using both word and character information, and it is demonstrated that this model outperforms the state-of-the-art neural English NER model on Japanese.

Simultaneous Tagging of Named Entities and Parts-of-Speech for Portuguese and Spanish Texts

This work collected and standardized a wide variety of datasets containing text in Portuguese and Spanish, annotated according to parts-of-speech and/or named entities, and evaluated a modern architecture for sequence labeling, considering transfer learning approaches based on multi-task learning and cross-lingual learning.

Neural Named Entity Recognition from Subword Units

A neural solution based on bidirectional LSTMs and conditional random fields is presented, in which the model relies on subword units, namely characters, phonemes, and bytes, to identify mentions of named entities in text, e.g., from transcribed speech.

microNER: A Micro-Service for German Named Entity Recognition based on BiLSTM-CRF

This work evaluates the performance of different word and character embeddings on two standard German datasets, with a special focus on out-of-vocabulary words, and publishes several pre-trained models wrapped into a Docker-based micro-service to allow easy integration of German NER into other applications via a JSON API.

Named Entity Recognition with Bidirectional LSTM-CNNs

A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.

Exploring the importance of context and embeddings in neural NER models for task-oriented dialogue systems

An array of experiments is performed with different combinations of two strategies: including the previous utterance in the dialogue as a source of additional features, and using word- and character-level embeddings trained on a larger external corpus.

CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

A Convolutional Attention Network (CAN) for Chinese NER is investigated; it consists of a character-based convolutional neural network with a local-attention layer and a gated recurrent unit with a global self-attention layer, capturing information from adjacent characters and sentence contexts.

Lexicon Infused Phrase Embeddings for Named Entity Resolution

A new form of learning word embeddings that leverages information from relevant lexicons to improve the representations is presented, along with the first system to use neural word embeddings to achieve state-of-the-art results on named entity recognition on both the CoNLL and OntoNotes datasets.

Learning Character-level Representations for Part-of-Speech Tagging

A deep neural network is proposed that learns character-level representations of words and associates them with usual word representations to perform POS tagging, producing state-of-the-art POS taggers for two languages.

Training State-of-the-Art Portuguese POS Taggers without Handcrafted Features

This work tackles Portuguese POS tagging using a deep neural network that employs a convolutional layer to learn character-level representations of words, producing state-of-the-art POS taggers for three corpora.

Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks

This paper systematically investigates three types of word representation (WR) features for biomedical NER (BNER), namely clustering-based representations, distributional representations, and word embeddings, and shows that all three WR approaches were beneficial to machine-learning-based BNER systems.

Machine Learning Algorithms for Portuguese Named Entity Recognition

The results suggest that Machine Learning can be useful in Portuguese NER and indicate that HMM, TBL and SVM perform well in this natural language processing task.

Natural Language Processing (Almost) from Scratch

We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.

Using Deep Belief Nets for Chinese Named Entity Categorization

This paper proposes a novel approach, Deep Belief Nets (DBN), for the Chinese entity mention categorization problem; DBNs have strong representational power and can self-train to discover complicated feature combinations.

Deep Learning for Efficient Discriminative Parsing

We propose a new fast purely discriminative algorithm for natural language parsing, based on a “deep” recurrent convolutional graph transformer network (GTN). Assuming a decomposition of a parse tree…

Efficient Estimation of Word Representations in Vector Space

Two novel model architectures for computing continuous vector representations of words from very large datasets are proposed, and these vectors are shown to provide state-of-the-art performance on a test set measuring syntactic and semantic word similarities.
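The skip-gram idea behind these vector representations can be illustrated with a toy full-softmax trainer. The corpus, dimensions, and learning rate here are illustrative assumptions; word2vec at scale replaces the full softmax with negative sampling or a hierarchical softmax:

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny illustrative corpus; real training uses billions of tokens.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8

W_in = rng.normal(scale=0.1, size=(V, D))    # word ("input") vectors
W_out = rng.normal(scale=0.1, size=(V, D))   # context ("output") vectors

def train_pair(center, context, lr=0.1):
    """One skip-gram SGD step with a full softmax (fine for a toy vocab)."""
    v = W_in[idx[center]].copy()             # old v, used for both gradients
    scores = W_out @ v
    p = np.exp(scores - scores.max())
    p /= p.sum()
    p[idx[context]] -= 1.0                   # grad of cross-entropy wrt scores
    W_in[idx[center]] -= lr * (W_out.T @ p)
    W_out[:] -= lr * np.outer(p, v)

for _ in range(300):                         # sweep the corpus repeatedly
    for t, w in enumerate(corpus):
        for c in corpus[max(0, t - 2):t] + corpus[t + 1:t + 3]:  # window +-2
            train_pair(w, c)

def cos(a, b):
    va, vb = W_in[idx[a]], W_in[idx[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# "cat" and "dog" appear in near-identical contexts, so their vectors
# typically end up closer than those of unrelated word pairs.
```

The key design point is that words are characterized by the contexts they appear in, so the learned vectors place distributionally similar words near each other, which is exactly the property the NER systems above exploit when they initialize from pre-trained embeddings.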

The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation

The objective of the ACE program is to develop technology to automatically infer from human language data the entities being mentioned, the relations among these entities that are directly expressed,…