SWOW-8500: Word Association task for Intrinsic Evaluation of Word Embeddings

Authors: Avijit Thawani and Anil Kumar Singh
Published in: Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for…
Downstream evaluation of pretrained word embeddings is expensive, more so for tasks where current state-of-the-art models are very large architectures. Intrinsic evaluation using word similarity or analogy datasets, on the other hand, suffers from several disadvantages. We propose a novel intrinsic evaluation task employing large word association datasets (particularly the Small World of Words dataset). We observe correlations not just between performances on SWOW-8500 and previously proposed…

A Comparative Study on Word Embeddings in Deep Learning for Text Classification
This study recommends choosing CNN over BiLSTM for document classification datasets where the context in sequence is not as indicative of class membership as sentence datasets, and concatenation of multiple classic embeddings or increasing their size does not lead to a statistically significant difference in performance despite a slight improvement in some cases.
Intrinsic analysis for dual word embedding space models
Two classical embedding methods from two different methodologies, window-based Word2Vec and count-based GloVe, are compared, and a non-default model is shown to be preferable on 2 out of 3 tasks.
Word associations and the distance properties of context-aware word embeddings
It is found that word embeddings are good models of some word association properties, and, like humans, their context-aware variants show violations of the triangle inequality.
Using Word Embeddings and Collocations for Modelling Word Associations
Three models aimed at different types of word associations are evaluated: a word-embedding model for synonymy, a point-wise mutual information model for word collocations, and a dependency model for common properties of words.
Playing Codenames with Language Graphs and Word Embeddings
An algorithm is proposed that can generate Codenames clues from the language graph BabelNet or from any of several embedding methods (word2vec, GloVe, fastText or BERT), along with a weighting term called DETECT that incorporates dictionary-based word representations and document frequency to improve clue selection.
A survey on Recognizing Textual Entailment as an NLP Evaluation
It is argued that when evaluating NLP systems, the community should utilize newly introduced RTE datasets that focus on specific linguistic phenomena that can be used to evaluate NLP systems on a fine-grained level.
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
This work introduces WinoGAViL: an online game to collect vision-and-language associations, used as a dynamic benchmark to evaluate state-of-the-art models and indicates that the collected associations require diverse reasoning skills, including general knowledge, common sense, abstraction, and more.
ParaDis and Démonette – From Theory to Resources for Derivational Paradigms
In this article, we trace the genesis of the French derivational database Démonette and show how its architecture and content stem from recent theoretical developments in derivational morphology and


Evaluating Word Embeddings Using a Representative Suite of Practical Tasks
This work proposes evaluating word embeddings in vivo by evaluating them on a suite of popular downstream tasks by using simple models with few tuned hyperparameters.
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks
It is suggested that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
Intrinsic and Extrinsic Evaluations of Word Embeddings
The semantic composition of word embeddings is analyzed by cross-referencing their clusters with the manual lexical database, WordNet, and it is shown that the word embedding clusters give high correlations to the synonym and hyponym sets in WordNet.
Large-scale learning of word relatedness with constraints
A large-scale data mining approach to learning word-word relatedness is presented, in which known pairs of related words impose constraints on the learning process; for each word it learns a low-dimensional representation that strives to maximize the likelihood of the word given the contexts in which it appears.
Massively Multilingual Word Embeddings
New methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space are introduced and a new evaluation method is shown to correlate better than previous ones with two downstream tasks.
GloVe: Global Vectors for Word Representation
A new global log-bilinear regression model is proposed that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Better Word Representations with Recursive Neural Networks for Morphology
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically-aware word representations and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
Efficient Estimation of Word Representations in Vector Space
Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.
Enriching Word Vectors with Subword Information
A new approach based on the skipgram model, where each word is represented as a bag of character n-grams, with words being represented as the sum of these representations, which achieves state-of-the-art performance on word similarity and analogy tasks.
Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors
An extensive evaluation of context-predicting models with classic, count-vector-based distributional semantic approaches, on a wide range of lexical semantics tasks and across many parameter settings shows that the buzz around these models is fully justified.