Using dependency parsing for few-shot learning in distributional semantics

Stefania Preda and Guy Edward Toh Emerson
In this work, we explore the novel idea of using dependency parsing information for few-shot learning, the task of learning the meaning of a rare word from a limited number of context sentences. Firstly, we use dependency-based word embedding models as background spaces for few-shot learning. Secondly, we introduce two few-shot learning methods which enhance the additive baseline model by using dependencies.
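As a rough illustration of the additive baseline mentioned above, the sketch below builds a vector for a rare word by averaging the background-space vectors of the words that co-occur with it in its context sentences. The function name `additive_embedding`, the toy two-dimensional background space, and the example sentences are all hypothetical and chosen for illustration; they are not taken from the paper, and the two dependency-enhanced methods it introduces are not shown here.

```python
import numpy as np

def additive_embedding(context_sentences, target, background, stopwords=frozenset()):
    """Average the background vectors of the context words around `target`.

    Assumption: words missing from the background space, stopwords, and the
    target itself are skipped. This is one common form of the additive
    baseline, not necessarily the exact variant used in the paper.
    """
    dim = len(next(iter(background.values())))
    total = np.zeros(dim)
    count = 0
    for sentence in context_sentences:
        for word in sentence:
            if word == target or word in stopwords or word not in background:
                continue
            total += background[word]
            count += 1
    # Return the zero vector if no usable context word was found.
    return total / count if count else total

# Toy background space (hypothetical values).
background = {
    "drinks": np.array([1.0, 0.0]),
    "hot": np.array([0.0, 1.0]),
    "cup": np.array([1.0, 1.0]),
}
sentences = [
    ["she", "drinks", "hot", "wampimuk"],
    ["a", "cup", "of", "wampimuk"],
]
vec = additive_embedding(sentences, "wampimuk", background)
```

A dependency-aware variant would presumably weight or filter the context words by their syntactic relation to the target rather than treating every word in the sentence equally, but the details of that step are specific to the paper and are not reproduced here.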

Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models
It is shown that hyperparameters that have largely been ignored in previous work can consistently improve the performance of both baseline and advanced models, achieving a new state of the art on 4 out of 6 tasks.
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
This paper proposes a version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, that is suited to modelling syntactic dependency graphs, and observes that GCN layers are complementary to LSTM ones.
Learning Semantic Representations for Novel Words: Leveraging Both Form and Context
This paper proposes an architecture that leverages both sources of information - surface-form and context - and shows that it results in large increases in embedding quality, and can be integrated into any existing NLP system and enhance its capability to handle novel words.
High-risk learning: acquiring new word vectors from tiny data
This paper shows that a neural language model such as Word2Vec needs only minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space.
Improving Distributional Similarity with Lessons Learned from Word Embeddings
It is revealed that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves, and these modifications can be transferred to traditional distributional models, yielding similar gains.
Learning Word Embeddings without Context Vectors
This work suggests using an indefinite inner product in the skip-gram negative sampling (SGNS) algorithm, which allows word embedding algorithms to use only one kind of vector, and performs on par with SGNS on word similarity datasets.
Words are Vectors, Dependencies are Matrices: Learning Word Embeddings from Dependency Graphs
This work proposes a new dependency-based DSM that has an inherent ability to represent dependency chains as products of matrices, which provides a straightforward way of handling further contexts of a word.
Cross-Lingual Word Embeddings for Low-Resource Language Modeling
This work investigates the use of bilingual lexicons to improve language models when textual training data is limited to as few as a thousand sentences, and involves learning cross-lingual word embeddings as a preliminary step in training monolingual language models.
Distributional Models of Word Meaning
This review presents the state of the art in distributional semantics, focusing on its assets and limits as a model of meaning and as a method for semantic analysis.
What are the Goals of Distributional Semantics?
This paper takes a broad linguistic perspective, looking at how well current distributional semantic models can deal with various semantic challenges, and concludes that future progress will require balancing the often conflicting demands of linguistic expressiveness and computational tractability.