Local Additivity Based Data Augmentation for Semi-supervised NER

Authors: Jiaao Chen, Zhenghui Wang, Ran Tian, Zichao Yang, Diyi Yang
Named Entity Recognition (NER) is one of the first stages in deep language understanding, yet current NER models rely heavily on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach has two variations: Intra-LADA and Inter-LADA, where Intra-LADA performs interpolations among tokens…
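To make "creating virtual samples by interpolating sequences" concrete, here is a minimal sketch of a generic mixup-style interpolation between two equal-length token sequences. It is not the authors' implementation: the function names `interpolate` and `sample_lambda`, the use of plain lists of floats as token vectors, and drawing the mixing weight from a Beta(alpha, alpha) distribution are all assumptions made for illustration. How LADA selects which sequences to pair, and at which layer it mixes them, is not covered by the abstract above.

```python
import random


def sample_lambda(alpha, rng):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha).

    Assumption: this is a common mixup convention, not a detail
    taken from the abstract.
    """
    return rng.betavariate(alpha, alpha)


def interpolate(seq_a, seq_b, labels_a, labels_b, lam):
    """Linearly mix two sequences token by token.

    seq_a, seq_b: lists of token vectors (lists of floats), same length.
    labels_a, labels_b: per-token label distributions (e.g. one-hot lists).
    lam: weight given to sequence A; sequence B gets 1 - lam.
    Returns (mixed_tokens, mixed_labels).
    """
    if len(seq_a) != len(seq_b) or len(labels_a) != len(labels_b):
        raise ValueError("sequences must have the same length")

    def mix(u, v):
        # Element-wise convex combination of two vectors.
        return [lam * x + (1 - lam) * y for x, y in zip(u, v)]

    mixed_tokens = [mix(u, v) for u, v in zip(seq_a, seq_b)]
    mixed_labels = [mix(p, q) for p, q in zip(labels_a, labels_b)]
    return mixed_tokens, mixed_labels


if __name__ == "__main__":
    rng = random.Random(0)
    lam = sample_lambda(alpha=8.0, rng=rng)
    # Hypothetical one-token sequences with 2-dimensional vectors
    # and a 2-class one-hot label each.
    tokens, labels = interpolate(
        [[1.0, 0.0]], [[0.0, 1.0]],
        [[1.0, 0.0]], [[0.0, 1.0]],
        lam,
    )
    print(lam, tokens, labels)
```

Mixing the labels with the same weight as the inputs keeps each virtual token's soft label consistent with the mixed representation, which is the usual motivation for this kind of interpolation.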


FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning
Most previous methods for text data augmentation are limited to simple tasks and weak baselines. We explore data augmentation on hard tasks (i.e., few-shot natural language understanding) and strong…
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization
HiddenCut is a simple yet effective data augmentation technique that better regularizes the model and encourages it to learn more generalizable features. It outperforms the state-of-the-art augmentation methods on the GLUE benchmark and consistently exhibits superior generalization performance on out-of-distribution and challenging counterexamples.
An Empirical Survey of Data Augmentation for Limited Data Learning in NLP
An empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting is provided, summarizing the landscape of methods and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks.
Substructure Substitution: Structured Data Augmentation for NLP
This work studies a family of data augmentation methods, substructure substitution (SUB), that generalizes prior methods, and presents variations of SUB based on text spans or parse trees, introducing structure-aware data augmentation methods to general NLP tasks.
A Survey on Data Augmentation for Text Classification
This survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners.


Semi-Supervised Sequence Modeling with Cross-View Training
Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data, is proposed and evaluated, achieving state-of-the-art results.
Unsupervised Data Augmentation
UDA has a small twist in that it makes use of harder and more realistic noise generated by state-of-the-art data augmentation methods, which leads to substantial improvements on six language tasks and three vision tasks even when the labeled set is extremely small.
Exploration of Noise Strategies in Semi-supervised Named Entity Classification
Different noise strategies for the semi-supervised named entity classification task are explored, including statistical methods such as adding Gaussian noise to input embeddings, and linguistically inspired ones such as dropping words and replacing words with their synonyms.
FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP
The core idea of the FLAIR framework is to present a simple, unified interface for conceptually very different types of word and document embeddings, which effectively hides all embedding-specific engineering complexity and allows researchers to “mix and match” various embeddings with little effort.
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification
By mixing labeled, unlabeled and augmented data, MixText significantly outperformed current pre-trained and fine-tuned models and other state-of-the-art semi-supervised learning methods on several text classification benchmarks.
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
A novel neural network architecture is introduced that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF, thus making it applicable to a wide range of sequence labeling tasks.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity, is presented.
Knowledge-Augmented Language Model and Its Application to Unsupervised Named-Entity Recognition
The KALM work demonstrates that named entities (and possibly other types of world knowledge) can be modeled successfully using predictive learning and training on large corpora of text without any additional information.
Variational Sequential Labelers for Semi-Supervised Learning
A family of multitask variational methods for semi-supervised sequence labeling that combines a latent-variable generative model and a discriminative labeler, and explores several latent variable configurations, including ones with hierarchical structure.