Learn More
Information extraction systems usually require two dictionaries: a semantic lexicon and a dictionary of extraction patterns for the domain. We present a multi-level bootstrapping algorithm that generates both the semantic lexicon and extraction patterns simultaneously. As input, our technique requires only unan-notated training texts and a handful of seed(More)
This paper presents the results of developing subjectivity classifiers using only unannotated texts for training. The performance rivals that of previous supervised learning approaches. In addition, we advance the state of the art in objective sentence classification by learning extraction patterns associated with objectivity and creating objective(More)
This paper describes a bootstrapping algorithm called Basilisk that learns high-quality semantic lexicons for multiple categories. Basilisk begins with an unannotated corpus and seed words for each semantic category, which are then bootstrapped to learn new words for each category. Basilisk hypothesizes the semantic class of a word based on collective(More)
We aim to shed light on the state-of-the-art in NP coreference resolution by teasing apart the differences in the MUC and ACE task definitions, the assumptions made in evaluation methodologies, and inherent differences in text corpora. First, we examine three subproblems that play a role in coreference resolution: named entity recognition, anaphoric-ity(More)
A common form of sarcasm on Twitter consists of a positive sentiment contrasted with a negative situation. For example, many sarcastic tweets include a positive sentiment, such as " love " or " enjoy " , followed by an expression that describes an undesirable activity or state (e.g., " taking exams " or " being ignored "). We have developed a sarcasm(More)
We present a novel approach to weakly supervised semantic class learning from the web, using a single powerful hyponym pattern combined with graph structures, which capture two properties associated with pattern-based extractions: popularity and productivity. Intuitively , a candidate is popular if it was discovered many times by other instances in the(More)