• Publications
MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer
MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations, is proposed, together with a novel invertible adapter architecture and a strong baseline method for adapting a pretrained multilingual model to a new language.
Monolingual and Cross-Lingual Information Retrieval Models Based on (Bilingual) Word Embeddings
A novel word representation learning model called Bilingual Word Embeddings Skip-Gram (BWESG) is presented which is the first model able to learn bilingual word embeddings solely on the basis of document-aligned comparable data.
On the Limitations of Unsupervised Bilingual Dictionary Induction
It is shown that a simple trick, exploiting a weak supervision signal from identical words, enables more robust induction and establishes a near-perfect correlation between unsupervised bilingual dictionary induction performance and a previously unexplored graph similarity metric.
AdapterHub: A Framework for Adapting Transformers
AdapterHub is proposed, a framework that allows dynamic “stitching-in” of pre-trained adapters for different tasks and languages, enabling scalable and easy sharing of task-specific models, particularly in low-resource scenarios.
Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
The evaluation shows that the Attract-Repel method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones.
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity
SimVerb-3500, an evaluation resource that provides human ratings for the similarity of 3,500 verb pairs, is introduced with the aim of enabling a richer understanding of the diversity and complexity of verb semantics and guiding the development of systems that can effectively represent and interpret this meaning.
A Survey of Cross-lingual Word Embedding Models
A comprehensive typology of cross-lingual word embedding models is provided, showing that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent modulo optimization strategies, hyper-parameters, and such.
Skip N-grams and Ranking Functions for Predicting Script Events
This work aims to answer key questions about how best to identify representative event chains from a source text, gather statistics from the event chains, and choose ranking functions for predicting new script events.
HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment
We introduce HyperLex—a data set and evaluation resource that quantifies the extent of the semantic category membership, that is, type-of relation, also known as hyponymy–hypernymy or lexical
ConveRT: Efficient and Accurate Conversational Representations from Transformers
ConveRT (Conversational Representations from Transformers) is proposed, a pretraining framework for conversational tasks that is effective, affordable, and quick to train, and that promises wider portability and scalability for Conversational AI applications.