Publications
Deep RNNs Encode Soft Hierarchical Syntax
TLDR
We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision.
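A minimal probing sketch of the general technique the TLDR names: train a linear diagnostic classifier on frozen hidden states from each RNN layer and compare accuracies across layers. The data, probe, and layer count below are illustrative assumptions, not the paper's exact setup.

```python
# A linear probe over frozen RNN activations, assuming precomputed
# per-token hidden states; random arrays stand in for real activations
# and POS tag ids, so accuracies here will sit near chance.
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_layer(hidden_states, labels):
    """hidden_states: (n_tokens, dim) activations from one RNN layer.
    labels: (n_tokens,) syntactic tag ids. Returns held-out accuracy."""
    split = int(0.8 * len(labels))
    clf = LogisticRegression(max_iter=1000)
    clf.fit(hidden_states[:split], labels[:split])
    return clf.score(hidden_states[split:], labels[split:])

rng = np.random.default_rng(0)
for layer in range(4):  # compare probe accuracy across layers
    h = rng.normal(size=(1000, 128))    # stand-in layer activations
    y = rng.integers(0, 10, size=1000)  # stand-in POS tag ids
    print(f"layer {layer}: probe accuracy = {probe_layer(h, y):.3f}")
```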
Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders
TLDR
A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are rare or unseen during training; we address this with a bi-encoder that jointly embeds target words in context and the glosses of their candidate senses.
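A minimal bi-encoder sketch of the architecture the title describes: one encoder embeds the target word in its context, a second embeds each candidate sense gloss, and senses are scored by dot product. The model names, pooling choices, and example are assumptions for illustration; an untrained model's scores are not meaningful until the encoders are fine-tuned.

```python
# Two encoders score (context, gloss) pairs by dot product. Without
# fine-tuning, the argmax below is arbitrary; training would pull the
# target-word and correct-gloss vectors together.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
context_enc = AutoModel.from_pretrained("bert-base-uncased")
gloss_enc = AutoModel.from_pretrained("bert-base-uncased")

def embed_target(sentence, target_word):
    """Mean-pool the subword vectors of the target word's occurrence."""
    enc = tok(sentence, return_tensors="pt")
    hidden = context_enc(**enc).last_hidden_state[0]  # (seq_len, dim)
    target_ids = tok(target_word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError("target word not found in sentence")

def embed_gloss(gloss):
    """Use the [CLS] vector as the gloss representation."""
    enc = tok(gloss, return_tensors="pt")
    return gloss_enc(**enc).last_hidden_state[0, 0]

sentence, word = "He sat on the bank of the river.", "bank"
glosses = ["a financial institution",
           "sloping land beside a body of water"]
with torch.no_grad():
    target_vec = embed_target(sentence, word)
    scores = torch.stack([embed_gloss(g) @ target_vec for g in glosses])
print(glosses[scores.argmax()])  # highest-scoring candidate gloss
```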
Automatically Processing Tweets from Gang-Involved Youth: Towards Detecting Loss and Aggression
TLDR
We present a corpus of tweets from a young and powerful female gang member and her communicators, which we have annotated with discourse intention, using a deep read to understand what triggers conversations to escalate into aggression.
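A minimal classification sketch of the detection task in the title, labeling tweets as expressing loss, aggression, or other. The features, model, and toy examples are illustrative stand-ins; the paper's own pipeline and preprocessing are not reproduced here.

```python
# TF-IDF features with a linear SVM over three toy tweets; real training
# would use the annotated corpus described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["miss you bro rip", "you gon see what happens", "good morning"]
labels = ["loss", "aggression", "other"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(tweets, labels)
print(clf.predict(["rip lil man miss you"]))  # expected: ['loss']
```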
Better Character Language Modeling Through Morphology
TLDR
We incorporate morphological supervision into character language models (CLMs) via multitasking and show that this addition improves bits-per-character (BPC) performance across 24 languages, even when the morphology data and language modeling data are disjoint.
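A minimal multitask sketch of the recipe in the TLDR: a shared character-level LSTM with one head for next-character prediction and one for morphological tags, trained on the summed losses, with BPC computed from the LM cross-entropy. Sizes and tag granularity are assumptions.

```python
# Shared character LSTM with a language-modeling head and a morphology
# head; the two cross-entropy losses are summed for multitask training.
import math
import torch
import torch.nn as nn

N_CHARS, N_TAGS, DIM = 100, 50, 256

class MultitaskCLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_CHARS, DIM)
        self.lstm = nn.LSTM(DIM, DIM, batch_first=True)  # shared encoder
        self.lm_head = nn.Linear(DIM, N_CHARS)    # next-character logits
        self.morph_head = nn.Linear(DIM, N_TAGS)  # morphology tag logits

    def forward(self, chars):
        h, _ = self.lstm(self.embed(chars))
        return self.lm_head(h), self.morph_head(h)

model = MultitaskCLM()
chars = torch.randint(0, N_CHARS, (8, 32))       # (batch, seq) char ids
next_chars = torch.randint(0, N_CHARS, (8, 32))  # LM targets
morph_tags = torch.randint(0, N_TAGS, (8, 32))   # per-char tag targets
lm_logits, morph_logits = model(chars)

ce = nn.CrossEntropyLoss()
lm_loss = ce(lm_logits.reshape(-1, N_CHARS), next_chars.reshape(-1))
morph_loss = ce(morph_logits.reshape(-1, N_TAGS), morph_tags.reshape(-1))
(lm_loss + morph_loss).backward()

# Bits-per-character is the LM cross-entropy converted from nats to bits.
print("BPC:", lm_loss.item() / math.log(2))
```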
Mining Paraphrasal Typed Templates from a Plain Text Corpus
TLDR
We present an approach that combines distributional and KB-driven methods for robust mining of sentence-level paraphrasal templates from a plain text corpus, using a rich type system for template slots.
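A minimal sketch of the typed-slot abstraction at the core of template mining: replace typed entities with typed slots and group sentences that share a template. The toy type lexicon stands in for the paper's KB-driven typing; the distributional side, which relates paraphrasal templates with different surface forms, is not reproduced.

```python
# Sentences are abstracted into typed templates; sentences sharing a
# template become paraphrase candidates. TYPE_LEXICON is a toy stand-in
# for KB-driven entity typing.
from collections import defaultdict

TYPE_LEXICON = {"paris": "CITY", "berlin": "CITY",
                "france": "COUNTRY", "germany": "COUNTRY"}

def to_template(sentence):
    tokens = sentence.lower().rstrip(".").split()
    return " ".join(f"[{TYPE_LEXICON[t]}]" if t in TYPE_LEXICON else t
                    for t in tokens)

sentences = ["Paris is the capital of France.",
             "Berlin is the capital of Germany."]
groups = defaultdict(list)
for s in sentences:
    groups[to_template(s)].append(s)

for template, members in groups.items():
    if len(members) > 1:  # shared typed template
        print(template, "<-", members)
```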
FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary
TLDR
We introduce FEWS (Few-shot Examples of Word Senses), a dataset built to comprehensively train and evaluate WSD models in few- and zero-shot settings.
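A minimal evaluation sketch of the few-/zero-shot setup the TLDR describes: senses seen at least once in training count as few-shot and unseen senses as zero-shot, with accuracy reported per bucket. The sense-key format and baseline predictor are hypothetical; FEWS's actual release format may differ.

```python
# Bucketed accuracy for a WSD model: senses with training examples are
# "few-shot"; senses absent from training are "zero-shot".
from collections import Counter

def evaluate(train_senses, test_examples, predict):
    """test_examples: (context, gold_sense) pairs; predict: any callable
    mapping a context string to a sense id."""
    seen = Counter(train_senses)
    buckets = {"few-shot": [0, 0], "zero-shot": [0, 0]}
    for context, gold in test_examples:
        key = "few-shot" if seen[gold] else "zero-shot"
        buckets[key][0] += int(predict(context) == gold)
        buckets[key][1] += 1
    return {k: c / n for k, (c, n) in buckets.items() if n}

# Hypothetical sense keys and a trivial constant-prediction baseline.
train = ["bank%1", "bank%1", "cell%2"]
test = [("the river bank", "bank%2"), ("a prison cell", "cell%2")]
print(evaluate(train, test, lambda ctx: "cell%2"))
# -> {'few-shot': 1.0, 'zero-shot': 0.0}
```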