Few-Shot Representation Learning for Out-Of-Vocabulary Words

@inproceedings{Hu2019FewShotRL,
  title={Few-Shot Representation Learning for Out-Of-Vocabulary Words},
  author={Ziniu Hu and Ting Chen and Kai-Wei Chang and Yizhou Sun},
  booktitle={ACL},
  year={2019}
}
Existing approaches for learning word embeddings often assume there are sufficient occurrences of each word in the corpus, so that the representation of a word can be accurately estimated from its contexts. However, in real-world scenarios, out-of-vocabulary (a.k.a. OOV) words that do not appear in the training corpus emerge frequently. How to learn accurate representations of these words from only a few observations, in order to augment a pre-trained embedding, is a challenging research problem. In this… 
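
The truncated abstract sets up the task: infer a vector for a word seen in only a handful of contexts, compatible with a pre-trained embedding space. As a point of reference, the simplest baseline for this setting averages the pre-trained vectors of the words surrounding the OOV word in its few observed sentences. The sketch below illustrates that additive baseline only; the embedding table, sentences, and the oov_embedding helper are toy placeholders, and this is not the model proposed in the paper.

# Minimal additive baseline for few-shot OOV embedding inference:
# average the pre-trained vectors of context words observed around
# the OOV word. Embeddings and sentences below are toy placeholders.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
pretrained = {w: rng.normal(size=dim) for w in
              ["the", "sick", "patient", "was", "given", "a", "strong", "dose"]}

def oov_embedding(oov_word, sentences, window=3):
    """Infer a vector for `oov_word` from a few sentences containing it."""
    context_vecs = []
    for tokens in sentences:
        idx = tokens.index(oov_word)
        lo, hi = max(0, idx - window), idx + window + 1
        for w in tokens[lo:idx] + tokens[idx + 1:hi]:
            if w in pretrained:          # skip other OOV words
                context_vecs.append(pretrained[w])
    return np.mean(context_vecs, axis=0) if context_vecs else np.zeros(dim)

few_shot_contexts = [
    ["the", "patient", "was", "given", "remdesivir"],
    ["a", "strong", "dose", "of", "remdesivir"],
]
vec = oov_embedding("remdesivir", few_shot_contexts)
print(vec.shape)  # (50,)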

Citations

Deep learning models for representing out-of-vocabulary words
TLDR
Although the results indicated that the best technique for handling OOV words is different for each task, Comick, a deep learning method that infers the embedding based on the context and the morphological structure of the OOV word, obtained promising results.
Robust Backed-off Estimation of Out-of-Vocabulary Embeddings
TLDR
Experimental results show that the obtained OOV word embeddings improve not only word similarity tasks but also downstream tasks in Twitter and biomedical domains where OOV words often appear, even when the computed OOV embeddings are integrated into a BERT-based strong baseline.
Estimator Vectors: OOV Word Embeddings based on Subword and Context Clue Estimates
  • R. Patel, C. Domeniconi
  • Computer Science
    2020 International Joint Conference on Neural Networks (IJCNN)
  • 2020
TLDR
A neural network model that jointly learns high-quality word representations, subword representations, and context clue representations, and is competitive with state-of-the-art methods for OOV estimation.
Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost
TLDR
A simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model and makes it robust to OOV with few additional parameters, and can be used in a plug-and-play fashion with FastText and BERT, where it significantly improves their robustness.
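
The "contrastive learning framework" ingredient can be illustrated generically: an encoder maps a surface form (e.g., a misspelled or unseen word) to a vector, and an InfoNCE-style loss pulls that vector toward the matching pre-trained embedding while pushing it away from the other embeddings in the batch. The snippet below shows only that generic loss under those assumptions, with random placeholder vectors; it is not LOVE's exact architecture or objective.

# Generic InfoNCE-style contrastive loss used when training a lightweight
# encoder to map (possibly corrupted / OOV) surface forms onto the vectors
# of a frozen pre-trained embedding space. Illustrative only.
import numpy as np

def info_nce_loss(anchor_vecs, target_vecs, temperature=0.07):
    """anchor_vecs[i] (encoder output) should match target_vecs[i]
    (frozen pre-trained vector); all other rows act as negatives."""
    a = anchor_vecs / np.linalg.norm(anchor_vecs, axis=1, keepdims=True)
    t = target_vecs / np.linalg.norm(target_vecs, axis=1, keepdims=True)
    logits = a @ t.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # correct pairs on the diagonal

rng = np.random.default_rng(1)
batch, dim = 8, 32
anchors = rng.normal(size=(batch, dim))          # e.g. encoder("mispeling")
targets = rng.normal(size=(batch, dim))          # e.g. pretrained["misspelling"]
print(info_nce_loss(anchors, targets))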
Lacking the Embedding of a Word? Look it up into a Traditional Dictionary
TLDR
Two methods are introduced: Definition Neural Network (DefiNNet) and Define BERT (DefBERT), which significantly outperform state-of-the-art as well as baseline methods devised for producing embeddings of unknown words.
FMEBA: A Fusion Multi-feature Model for Chinese Out of Vocabulary Word Embedding Generation
TLDR
A Fusion Multi-feature Encoder Based on Attention (FMEBA) is proposed for processing Chinese OOV words; it uses the radical features of characters together with a character-level Transformer encoder to process character-sequence information and context information.
Learning to Learn Words from Visual Scenes
TLDR
A meta-learning framework is introduced that learns how to learn word representations from unconstrained scenes, using the natural compositional structure of language to create training episodes that cause a meta-learner to acquire strong policies for language acquisition.
Meta Learning and Its Applications to Natural Language Processing
TLDR
This tutorial aims to help researchers in the NLP community better understand meta-learning, one of the most important new techniques in machine learning in recent years, and to promote more research studies that use it.
Relative and Absolute Location Embedding for Few-Shot Node Classification on Graph
TLDR
A novel model called Relative and Absolute Location Embedding (RALE), hinged on the concept of hub nodes, is proposed; it captures task-level dependency by assigning each node a relative location within a task, and graph-level dependency by assigning each node an absolute location on the graph, to further align different tasks toward learning a transferable prior.

References

Showing 1-10 of 33 references
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors
TLDR
A la carte embedding is introduced, a simple and general alternative to the usual word2vec-based approaches for building such feature vectors, based upon recent theoretical results for GloVe-like embeddings.
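
The recipe associated with this approach can be sketched in a few lines: average the pre-trained vectors of the words in a target word's contexts, then apply a linear transformation fit by least squares so that, for in-vocabulary words, transformed average-context vectors reproduce the original embeddings. The sketch below assumes such paired arrays are already available and fills them with random placeholders.

# A la carte induction sketch: learn a linear map A with least squares so that
# A @ (average context vector of word w) ~ pretrained vector of w, then apply
# A to the averaged contexts of an unseen word. Arrays are placeholders.
import numpy as np

rng = np.random.default_rng(2)
n_words, dim = 1000, 50
V = rng.normal(size=(n_words, dim))   # pretrained vectors of in-vocab words
U = rng.normal(size=(n_words, dim))   # their averaged context vectors

# Solve min_A ||U A^T - V||_F^2  (row-wise: A u_w ~= v_w).
A_T, *_ = np.linalg.lstsq(U, V, rcond=None)

def a_la_carte(avg_context_vec):
    return avg_context_vec @ A_T      # induced embedding for the new word

oov_context_avg = rng.normal(size=dim)
print(a_la_carte(oov_context_avg).shape)  # (50,)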
Mimicking Word Embeddings using Subword RNNs
TLDR
MIMICK is presented, an approach to generating OOV word embeddings compositionally by learning a function from spellings to distributional embeddings, with learning performed at the type level of the original word embedding corpus.
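
The core idea, learning a function from a word's spelling to its pre-trained vector, can be sketched with a small character-level BiLSTM regressed onto an existing embedding table. The toy vocabulary, random "pre-trained" targets, and hyperparameters below are placeholders rather than the authors' configuration.

# Character-level "mimick"-style sketch: a BiLSTM reads a word's spelling and
# is trained (MSE) to reproduce that word's pre-trained vector, so it can later
# produce vectors for unseen spellings. Data below is a toy placeholder.
import torch
import torch.nn as nn

chars = "abcdefghijklmnopqrstuvwxyz"
char2id = {c: i + 1 for i, c in enumerate(chars)}        # 0 is padding

class SpellingEncoder(nn.Module):
    def __init__(self, char_dim=16, hidden=32, emb_dim=50):
        super().__init__()
        self.char_emb = nn.Embedding(len(chars) + 1, char_dim, padding_idx=0)
        self.lstm = nn.LSTM(char_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, emb_dim)

    def forward(self, char_ids):                          # (batch, max_len)
        h, _ = self.lstm(self.char_emb(char_ids))
        # Simplification: take the last time step (padding is read too).
        return self.out(h[:, -1, :])

def encode(words, max_len=12):
    ids = torch.zeros(len(words), max_len, dtype=torch.long)
    for i, w in enumerate(words):
        for j, c in enumerate(w[:max_len]):
            ids[i, j] = char2id.get(c, 0)
    return ids

words = ["cat", "cats", "dog", "running"]                 # toy in-vocab words
targets = torch.randn(len(words), 50)                     # placeholder pretrained vectors

model = SpellingEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):                                      # tiny training loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(encode(words)), targets)
    loss.backward()
    opt.step()

oov_vec = model(encode(["doggo"]))                        # vector for an unseen spelling
print(oov_vec.shape)                                      # torch.Size([1, 50])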
Enriching Word Vectors with Subword Information
TLDR
A new approach based on the skipgram model, where each word is represented as a bag of character n-grams, with words being represented as the sum of these representations, which achieves state-of-the-art performance on word similarity and analogy tasks.
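
This subword scheme gives a natural fallback for OOV words: extract the word's character n-grams (with boundary markers), look each one up in a hashed n-gram table, and combine those vectors. The sketch below uses a random table and Python's built-in hash as stand-ins, so it shows only the composition step, not the trained model or the library's exact hashing.

# fastText-style subword composition sketch: an OOV word's vector is built by
# averaging the vectors of its character n-grams (with "<"/">" boundary marks).
# The hash-bucket table is a random placeholder, not a trained model.
import numpy as np

rng = np.random.default_rng(3)
n_buckets, dim = 100_000, 50
ngram_table = rng.normal(size=(n_buckets, dim)).astype(np.float32)

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def oov_vector(word):
    grams = char_ngrams(word)
    rows = [hash(g) % n_buckets for g in grams]   # built-in hash as a stand-in
    return ngram_table[rows].mean(axis=0)

print(oov_vector("unfoobared").shape)   # (50,)
print(char_ngrams("where")[:4])         # ['<wh', 'whe', 'her', 'ere']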
High-risk learning: acquiring new word vectors from tiny data
TLDR
This paper shows that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space.
Diverse Few-Shot Text Classification with Multiple Metrics
TLDR
This work proposes an adaptive metric learning approach that automatically determines the best weighted combination from a set of metrics obtained from meta-training tasks for a newly seen few-shot task.
Better Word Representations with Recursive Neural Networks for Morphology
TLDR
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
Matching Networks for One Shot Learning
TLDR
This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
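
The prediction mechanism can be sketched without the deep feature extractor: embed a small labelled support set and a query, compute cosine similarities, turn them into attention weights with a softmax, and read off the query's class distribution as the attention-weighted combination of support labels. The feature vectors below are random placeholders for whatever embedding network is assumed.

# Matching-network-style prediction sketch: a query is classified by attending
# (softmax over cosine similarities) to a small labelled support set, with no
# fine-tuning on the new classes. Feature vectors here are random placeholders.
import numpy as np

rng = np.random.default_rng(4)
n_support, n_classes, dim = 5, 3, 64
support_x = rng.normal(size=(n_support, dim))           # embedded support examples
support_y = np.eye(n_classes)[[0, 0, 1, 2, 2]]          # one-hot support labels
query_x = rng.normal(size=dim)                          # embedded query example

def matching_predict(query, sx, sy):
    sims = (sx @ query) / (np.linalg.norm(sx, axis=1) * np.linalg.norm(query))
    attn = np.exp(sims - sims.max())
    attn /= attn.sum()                                  # attention over support set
    return attn @ sy                                    # class distribution for query

print(matching_predict(query_x, support_x, support_y))  # sums to 1 over 3 classes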
Deep Contextualized Word Representations
TLDR
A new type of deep contextualized word representation is introduced that models both complex characteristics of word use and how these uses vary across linguistic contexts, allowing downstream models to mix different types of semi-supervision signals.
Deep Learning of Representations for Unsupervised and Transfer Learning
  • Yoshua Bengio
  • Computer Science
    ICML Unsupervised and Transfer Learning
  • 2012
TLDR
This paper discusses why unsupervised pre-training of representations can be useful and how it can be exploited in the transfer learning scenario, where the authors care about predictions on examples that are not from the same distribution as the training distribution.