LINSPECTOR: Multilingual Probing Tasks for Word Representations

@article{Sahin2020LINSPECTORMP,
  title={LINSPECTOR: Multilingual Probing Tasks for Word Representations},
  author={G{\"o}zde G{\"u}l Sahin and Clara Vania and Ilia Kuznetsov and Iryna Gurevych},
  journal={Computational Linguistics},
  year={2020},
  volume={46},
  pages={335--385}
}
Despite an ever-growing number of word representation models introduced for a large number of languages, there is a lack of a standardized technique to provide insights into what is captured by these models. Such insights would help the community to get an estimate of the downstream task performance, as well as to design more informed neural architectures, while avoiding extensive experimentation that requires substantial computational resources not all researchers have access to. A recent… 

LINSPECTOR WEB: A Multilingual Probing Suite for Word Representations

TLDR
LINSPECTOR WEB is an open source multilingual inspector to analyze word representations and supports probing of static word embeddings along with pretrained AllenNLP models that are commonly used for NLP downstream tasks such as named entity recognition, natural language inference and dependency parsing.
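
In practice, a word-level probing task of the kind LINSPECTOR provides amounts to training a lightweight diagnostic classifier on frozen word vectors to predict an intrinsic linguistic property. A minimal sketch of that setup, using random stand-in data in place of real pretrained vectors and gold morphological labels (not the LINSPECTOR code itself):

```python
# Minimal word-level probing sketch (stand-in data; not the LINSPECTOR code).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-ins for the real inputs: one frozen pretrained vector per word form and a
# gold label per word, e.g. the grammatical case of a Turkish noun.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2000, 300))   # would be fastText/word2vec vectors
labels = rng.integers(0, 6, size=2000)      # would be case tags: Nom, Acc, Dat, ...

X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

# A simple linear probe: if it recovers the property from the frozen vectors,
# the information is (linearly) decodable from the representation.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probing accuracy:", probe.score(X_te, y_te))
```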

AlephBERT: Language Model Pre-training and Evaluation from Sub-Word to Sentence Level

TLDR
AlephBERT is presented, a large PLM for Modern Hebrew, trained on a larger vocabulary and a larger dataset than any previous Hebrew PLM, and a novel neural architecture is introduced that recovers the morphological segments encoded in contextualized embedding vectors.

Intrinsic Probing through Dimension Selection

TLDR
This paper proposes a novel framework based on a decomposable multivariate Gaussian probe that allows us to determine whether the linguistic information in word embeddings is dispersed or focal, and probes fastText and BERT for various morphosyntactic attributes across 36 languages.
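
The dimension-selection idea can be approximated with a greedy search: start from an empty set of coordinates and repeatedly add the one that most improves a simple Gaussian probe restricted to the selected coordinates. A rough sketch of that loop, using scikit-learn's Gaussian naive Bayes as a stand-in for the paper's decomposable multivariate Gaussian probe and random data in place of real embeddings:

```python
# Greedy dimension selection with a Gaussian probe (illustrative approximation,
# not the paper's exact decomposable multivariate Gaussian formulation).
import numpy as np
from sklearn.naive_bayes import GaussianNB

def greedy_select(X_train, y_train, X_dev, y_dev, k=10):
    """Greedily pick the k embedding dimensions that best predict the property."""
    selected, remaining = [], list(range(X_train.shape[1]))
    for _ in range(k):
        best_dim, best_acc = None, -1.0
        for d in remaining:
            dims = selected + [d]
            acc = GaussianNB().fit(X_train[:, dims], y_train).score(X_dev[:, dims], y_dev)
            if acc > best_acc:
                best_dim, best_acc = d, acc
        selected.append(best_dim)
        remaining.remove(best_dim)
        print(f"added dim {best_dim}, dev accuracy {best_acc:.3f}")
    return selected

# Toy usage with random stand-in data (real inputs would be frozen fastText/BERT
# vectors and gold morphosyntactic labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))
y = rng.integers(0, 3, size=500)
greedy_select(X[:400], y[:400], X[400:], y[400:], k=5)
```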

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations

TLDR
It is found that syntactic sensitivity depends on the language and on the model's pre-training objectives, and that this sensitivity grows across layers as the perturbation granularity increases; it is also shown that the models barely use positional information to induce syntactic trees from their intermediate self-attention and contextualized representations.

RuSentEval: Linguistic Source, Encoder Force!

TLDR
RuSentEval is introduced, an enhanced set of 14 probing tasks for Russian, including ones that have not been explored before, and is used to examine the distribution of various linguistic properties in five multilingual transformers for two typologically contrasting languages.

Investigating Language Relationships in Multilingual Sentence Encoders Through the Lens of Linguistic Typology

TLDR
This article proposes methods for separating language-specific subspaces within state-of-the-art multilingual sentence encoders (LASER, M-BERT, XLM, and XLM-R) with respect to a range of typological properties pertaining to lexical, morphological, and syntactic structure and investigates how typological information about languages is distributed across all layers of the models.

A multilabel approach to morphosyntactic probing

TLDR
It is shown that multilingual BERT renders many morphosyntactic features easily and simultaneously extractable (e.g., gender, grammatical case, pronominal type) and has the added benefit of revealing the linguistic properties that language models recognize as being shared across languages.
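
A multilabel probe of this kind lets a single classifier read out all morphosyntactic attributes of a token at once rather than training one probe per feature. A small sketch under that reading, with an illustrative feature inventory and random stand-in data in place of mBERT vectors and gold UD annotations:

```python
# Multilabel morphosyntactic probing sketch (stand-in data and feature inventory).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

features = ["Gender=Fem", "Case=Acc", "PronType=Prs", "Number=Plur"]  # illustrative only

# Stand-ins: one contextual vector per token and a binary indicator per feature value.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))                     # would be mBERT token vectors
Y = rng.integers(0, 2, size=(1000, len(features)))   # would be gold UD feature flags

# One probe shared across features: every output column gets its own binary decision,
# so all attributes are read out from the same representation simultaneously.
probe = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(probe.predict(X[:3]))   # shape (3, 4): one prediction per token per feature
```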

Attention Understands Semantic Relations

TLDR
It is shown that in this task, attention scores express information about relations similarly to the layers' output activations, despite their lesser ability to represent surface cues, supporting the hypothesis that attention mechanisms focus not only on syntactic relational information but on semantic information as well.

Information-Theoretic Probing for Linguistic Structure

TLDR
An information-theoretic operationalization of probing as estimating mutual information is proposed that contradicts received wisdom: one should always select the highest-performing probe one can, even if it is more complex, since this yields a tighter estimate and thus reveals more of the linguistic information inherent in the representation.
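
The argument rests on a standard identity: the mutual information between a linguistic property T and a representation R decomposes into entropies, and a probe's cross-entropy upper-bounds the conditional entropy term, so a lower probe loss tightens the resulting lower bound on the mutual information. A sketch of the reasoning (the notation here is assumed, not quoted from the paper):

```latex
% Mutual information between a linguistic property T and a representation R:
I(T; R) = H(T) - H(T \mid R)

% The cross-entropy of any probe q(t \mid r) upper-bounds the true conditional entropy:
H(T \mid R) \;\le\; H_q(T \mid R) \;=\; -\mathbb{E}_{(t,r)}\!\left[\log q(t \mid r)\right]

% So every probe yields a lower bound on I(T; R), and the best-performing
% (lowest cross-entropy) probe gives the tightest bound:
I(T; R) \;\ge\; H(T) - H_q(T \mid R)
```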

What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties

TLDR
This work proposes methods for probing sentence representations from state-of-the-art multilingual encoders with respect to a range of typological properties pertaining to lexical, morphological, and syntactic structure, and shows interesting differences in how linguistic variation is encoded under different pretraining strategies.

References

Showing 1-10 of 96 references

LINSPECTOR WEB: A Multilingual Probing Suite for Word Representations

TLDR
LINSPECTOR WEB is an open source multilingual inspector to analyze word representations and supports probing of static word embeddings along with pretrained AllenNLP models that are commonly used for NLP downstream tasks such as named entity recognition, natural language inference and dependency parsing.

Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction

TLDR
The main technical contribution of this work is a novel method for injecting subword-level information into semantic word vectors, integrated into neural language model training, to facilitate word-level prediction.

Enriching Word Vectors with Subword Information

TLDR
A new approach based on the skipgram model in which each word is represented as a bag of character n-grams and word vectors are the sum of these n-gram representations; it achieves state-of-the-art performance on word similarity and analogy tasks.
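
Under this scheme a word vector is simply the sum of the vectors of its character n-grams (plus the word itself), with boundary markers so that prefixes and suffixes are distinguishable; unseen words therefore still get a vector. A toy illustration of the lookup, assuming a hypothetical hashed n-gram table in place of trained parameters:

```python
# fastText-style subword composition sketch (toy hashing, random stand-in vectors).
import numpy as np

DIM, BUCKETS = 100, 100_000        # fastText's default bucket count is much larger (2M)
rng = np.random.default_rng(0)
ngram_table = rng.normal(size=(BUCKETS, DIM))   # stands in for trained n-gram vectors

def char_ngrams(word, n_min=3, n_max=6):
    marked = f"<{word}>"           # boundary markers: the word "<her>" differs from "her" inside "<where>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]

def word_vector(word):
    units = char_ngrams(word) + [f"<{word}>"]    # n-grams plus the full word itself
    rows = [hash(u) % BUCKETS for u in units]    # toy hash; the real model hashes n-grams into buckets
    return ngram_table[rows].sum(axis=0)         # word vector = sum of its subword vectors

print(word_vector("where")[:5])                  # works for unseen words too
```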

Better Word Representations with Recursive Neural Networks for Morphology

TLDR
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks

TLDR
This work proposes a framework that facilitates better understanding of the encoded representations of sentence vectors and demonstrates the potential contribution of the approach by analyzing different sentence representation mechanisms.

XNLI: Evaluating Cross-lingual Sentence Representations

TLDR
This work constructs an evaluation set for cross-lingual language understanding (XLU) by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 15 languages, including low-resource languages such as Swahili and Urdu, and finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.

Compositional Representation of Morphologically-Rich Input for Neural Machine Translation

TLDR
This paper replaces the source-language embedding layer of NMT with a bi-directional recurrent neural network that generates compositional representations of the input at any desired level of granularity, consistently outperforming NMT models that learn embeddings of statistically generated sub-word units.

From Characters to Words to in Between: Do We Capture Morphology?

TLDR
None of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.

GloVe: Global Vectors for Word Representation

TLDR
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature (global matrix factorization and local context window methods) and produces a vector space with meaningful substructure.
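
Concretely, the model fits word vectors w_i and context vectors w̃_j so that their dot product (plus biases) matches the log of the global co-occurrence count X_ij, with a weighting function f that damps very rare and very frequent pairs; the weighted least-squares objective from the GloVe paper is:

```latex
J = \sum_{i,j=1}^{V} f(X_{ij})\,\bigl(w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\bigr)^{2},
\qquad
f(x) =
\begin{cases}
  (x / x_{\max})^{\alpha} & \text{if } x < x_{\max}\\
  1 & \text{otherwise}
\end{cases}
```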

Character-Aware Neural Language Models

TLDR
A simple neural language model that relies only on character-level inputs and is able to encode, from characters alone, both semantic and orthographic information; the results suggest that for many languages, character inputs are sufficient for language modeling.