Publications
Evaluating Models’ Local Decision Boundaries via Contrast Sets
TLDR
Proposes a more rigorous annotation paradigm for NLP that helps close systematic gaps in test data: after constructing a dataset, its authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets.
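As a rough illustration of the evaluation this enables (a sketch, not the authors' released code; the metric follows the paper's notion of contrast consistency, but the data layout and identifiers here are assumptions):

```python
# Hypothetical sketch of contrast-set evaluation. The data layout and
# function names are illustrative assumptions, not the paper's code.
from typing import Callable, Dict, List

# An original test instance bundled with its manually perturbed variants.
ContrastSet = Dict[str, List[dict]]

example_set: ContrastSet = {
    "original":  [{"text": "A charming, funny film.", "label": "positive"}],
    "perturbed": [{"text": "A charming but ultimately tedious film.",
                   "label": "negative"}],
}

def contrast_consistency(sets: List[ContrastSet],
                         predict: Callable[[str], str]) -> float:
    """Fraction of contrast sets on which the model labels the original
    AND every perturbation correctly; stricter than per-example accuracy."""
    consistent = 0
    for cs in sets:
        examples = cs["original"] + cs["perturbed"]
        if all(predict(ex["text"]) == ex["label"] for ex in examples):
            consistent += 1
    return consistent / len(sets)
```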
Polyglot Semantic Role Labeling
TLDR
Analysis of the polyglot models' performance provides a new understanding of the similarities and differences between the languages in the shared task, and the approach improves parsing performance on several languages over a monolingual baseline.
Polyglot Contextual Representations Improve Crosslingual Transfer
TLDR
Introduces Rosita, a method for producing multilingual contextual word representations by training a single language model on text from multiple languages; the results provide further evidence for the benefits of polyglot learning, in which representations are shared across languages.
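A minimal sketch of the polyglot idea, assuming training simply pools text from several languages into one stream for a single shared model (the file paths and uniform pooling below are illustrative, not the Rosita pipeline):

```python
# Illustrative data preparation for polyglot LM training: one model,
# one training stream built from several monolingual corpora.
# Paths and uniform shuffling are assumptions, not the paper's recipe.
import random

def polyglot_corpus(paths, seed=0):
    """Pool sentences from several monolingual corpora into a single
    shuffled stream, so one language model sees all languages."""
    sentences = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            sentences.extend(line.strip() for line in f if line.strip())
    random.Random(seed).shuffle(sentences)
    return sentences

# e.g. one model trained on English plus Arabic text (hypothetical files):
# train_data = polyglot_corpus(["en.txt", "ar.txt"])
```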
Evaluating NLP Models via Contrast Sets
TLDR
Proposes a new annotation paradigm for NLP that helps close systematic gaps in test data: after a dataset is constructed, its authors manually perturb the test instances in small but meaningful ways that change the gold label, creating contrast sets.
Low-Resource Parsing with Crosslingual Contextualized Representations
TLDR
The non-contextual part of the learned language models is examined to demonstrate that polyglot language models encode crosslingual lexical correspondence better than aligned monolingual language models, providing further evidence that polyglot training is an effective approach to crosslingual transfer.
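One way to probe crosslingual lexical correspondence in non-contextual embeddings is nearest-neighbor translation retrieval: for each source word in a bilingual dictionary, check whether its translation is the closest target-language word in embedding space. A sketch under assumed data structures (not the paper's analysis code):

```python
# Hypothetical translation-retrieval probe over static (non-contextual)
# word embeddings; the dictionary format and names are assumptions.
import numpy as np

def translation_precision_at_1(src_vecs, tgt_vecs, dictionary):
    """Fraction of dictionary pairs (s, t) for which t is the nearest
    target-language word to s by cosine similarity."""
    tgt_words = list(tgt_vecs)
    tgt_mat = np.stack([tgt_vecs[w] for w in tgt_words])
    tgt_mat /= np.linalg.norm(tgt_mat, axis=1, keepdims=True)
    pairs = [(s, t) for s, t in dictionary
             if s in src_vecs and t in tgt_vecs]
    hits = 0
    for s, t in pairs:
        v = src_vecs[s] / np.linalg.norm(src_vecs[s])
        hits += tgt_words[int(np.argmax(tgt_mat @ v))] == t
    return hits / len(pairs)
```

Higher retrieval precision for the polyglot model's embeddings than for aligned monolingual ones would indicate better-encoded lexical correspondence.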
Grounded Compositional Outputs for Adaptive Language Modeling
TLDR
Proposes a fully compositional output embedding layer for language models, grounded in information from a structured lexicon (WordNet), namely semantically related words and free-text definitions; the result is the first word-level language model whose size does not depend on the training vocabulary.
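A minimal PyTorch sketch of the core idea: compute each output embedding from the word's character sequence instead of a vocabulary-sized lookup table, so no parameter scales with the vocabulary. Module structure and hyperparameters here are assumptions, and the lexicon grounding (related words, definitions) is omitted:

```python
# Sketch of a compositional output embedding layer. Output vectors are
# composed from characters on the fly; architecture details are
# illustrative assumptions, not the paper's released model.
import torch
import torch.nn as nn

class CompositionalOutput(nn.Module):
    def __init__(self, n_chars: int, char_dim: int = 32, hidden: int = 256):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.encoder = nn.GRU(char_dim, hidden, batch_first=True)

    def embed_words(self, char_ids: torch.Tensor) -> torch.Tensor:
        # char_ids: (n_words, max_word_len) character indices.
        _, h = self.encoder(self.char_emb(char_ids))
        return h.squeeze(0)                      # (n_words, hidden)

    def logits(self, state: torch.Tensor,
               char_ids: torch.Tensor) -> torch.Tensor:
        # Score any candidate vocabulary, including words never seen in
        # training, by composing their output embeddings on demand.
        out_emb = self.embed_words(char_ids)     # (n_words, hidden)
        return state @ out_emb.T                 # (batch, n_words)
```

Because the output matrix is computed rather than stored, the layer can score unseen words, which is what makes the model's size independent of the training vocabulary.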
Technology-Enabled Disinformation: Summary, Lessons, and Recommendations
TLDR
This report summarizes the space of technology-enabled mis- and disinformation, based on a set of investigations, and surfaces lessons and recommendations for technologists, researchers, platform designers, policymakers, and users.