Distant supervision for relation extraction without labeled data
TLDR
This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
TLDR
This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Learning Syntactic Patterns for Automatic Hypernym Discovery
TLDR
This paper presents a new algorithm for automatically learning hypernym (is-a) relations from text, using "dependency path" features extracted from parse trees, and introduces a general-purpose formalization and generalization of these patterns.
Semantic Taxonomy Induction from Heterogenous Evidence
TLDR
This work proposes a novel algorithm for inducing semantic taxonomies that flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word's coordinate terms to help in determining its hypernyms, and vice versa.
Learning to Merge Word Senses
TLDR
A discriminative classifier is trained over a wide variety of features derived from WordNet structure, corpus-based evidence, and evidence from other lexical resources, and a learned similarity measure outperforms previously proposed automatic methods for sense clustering on the task of predicting human sense merging judgments.
Smoothing techniques for adaptive online language models: topic tracking in tweet streams
TLDR
Experiments show that unigram language models smoothed using a normalized extension of stupid backoff and a simple queue for history retention perform well on the task of tracking broad topics in continuous streams of short texts from the microblogging service Twitter.
Learning Named Entity Hyponyms for Question Answering
TLDR
It is demonstrated how a recently developed statistical approach to mining such relations can be tailored to identify named entity hyponyms, and how, as a result, superior question answering performance can be obtained.
Effectively Using Syntax for Recognizing False Entailment
TLDR
A novel framework for recognizing textual entailment that focuses on the use of syntactic heuristics to recognize false entailment is presented, which demonstrates state-of-the-art performance on a widely-used test set.
A combinatorial problem associated with nonograms
Associated with an m × n matrix with entries 0 or 1 are the m-vector of row sums and n-vector of column sums. In this article we study the set of all pairs of these row and column sums for fixed m…
Semantic taxonomy induction
TLDR
This work addresses four key problems in automatically reading and understanding text: extracting the knowledge expressed in a body of text in the form of structured relations, reconciling and formalizing that knowledge in a fully consistent, sense-disambiguated hierarchy of knowledge, fluidly transitioning from fine-grained to coarse-grained distinctions between word senses, and applying extracted structured knowledge in applications that depend on deep textual understanding.