Disfluency Detection with a Semi-Markov Model and Prosodic Features
We present a discriminative model for detecting disfluencies in spoken language transcripts. Structurally, our model is a semi-Markov conditional random field with features targeting characteristics
Semi-Supervised Event Extraction with Paraphrase Clusters
TLDR
This work presents a method for self-training event extraction systems that bootstraps additional training data from multiple mentions of the same event instance across newswire articles from multiple sources.
IIRC: A Dataset of Incomplete Information Reading Comprehension Questions
TLDR
A dataset of more than 13K questions over paragraphs from English Wikipedia that provide only partial information for answering them, with the missing information occurring in one or more linked documents. A baseline model achieves 31.1% F1 on this task, while estimated human performance is 88.4%.
University of Washington TAC-KBP 2016 System Description
TLDR
The University of Washington’s event extraction system was composed of three components: Evento, a CRF-based extractor; NomEvent, which uses a lexicon to build features for identifying nominal triggers; and NewsSpike, which uses an unsupervised training process to produce a high-precision extractor.
Identifying Domain Adjacent Instances for Semantic Parsers
TLDR
This work formalizes the problem of domain adjacency and proposes a simple new sentence representation that emphasizes unexpected words, which improves the performance of a downstream semantic parser run on in-domain and domain-adjacent instances.