A Primer in BERTology: What We Know About How BERT Works
TLDR
We review the current state of knowledge about how BERT works, what kind of information it learns and how it is represented, common modifications to its training objectives and architecture, and approaches to compression.
Revealing the Dark Secrets of BERT
TLDR
We propose a methodology and offer the first detailed analysis of BERT’s capacity to capture different kinds of linguistic information in its self-attention weights.
Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn't
TLDR
We present a balanced test set with 99,200 questions in 40 categories, and we systematically examine how accuracy for different categories is affected by window size and dimensionality of the SVD-based word embeddings.
Word Embeddings, Analogies, and Machine Learning: Beyond king - man + woman = queen
TLDR
We show that the information not detected by linear offset may still be recoverable by a more sophisticated search method, and thus is actually encoded in the embedding.
RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian
TLDR
This paper presents RuSentiment, a new dataset for sentiment analysis of social media posts in Russian, together with a new set of comprehensive annotation guidelines that are extensible to other languages.
Getting Closer to AI Complete Question Answering: A Set of Prerequisite Real Tasks
TLDR
We present QuAIL, the first reading-comprehension dataset to combine text-based questions, world-knowledge questions, and unanswerable questions, and to provide question-type annotation that enables diagnostics of the reasoning strategies used by a given QA system.
Intrinsic Evaluations of Word Embeddings: What Can We Do Better?
TLDR
This paper presents an analysis of existing methods for the intrinsic evaluation of word embeddings.
Adversarial Decomposition of Text Representation
TLDR
In this paper, we present a method for adversarial decomposition of text representation.
Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings
TLDR
We provide a systematic investigation of four different syntactic context types and context representations for learning word embeddings.
The (too Many) Problems of Analogical Reasoning with Word Vectors
TLDR
We argue against such “linguistic regularities” as a model for linguistic relations in vector space models and as a benchmark, and we show that the vector offset (as well as two other, better-performing methods) suffers from dependence on vector similarity.