We present a supervised learning approach to identification of anaphoric and non-anaphoric noun phrases and show how such information can be incorporated into a coreference resolution system. The resulting system outperforms the best MUC-6 and MUC-7 coreference resolution systems on the corresponding MUC coreference data sets, obtaining F-measures of 66.2 ...
This paper examines two problems in document-level sentiment analysis: (1) determining whether a given document is a review or not, and (2) classifying the polarity of a review as positive or negative. We first demonstrate that review identification can be performed with high accuracy using only unigrams as features. We then examine the role of four types ...
Traditional learning-based coreference resolvers operate by training a mention-pair classifier for determining whether two mentions are coreferent or not. Two independent lines of recent research have attempted to improve these mention-pair classifiers, one by learning a mention-ranking model to rank preceding mentions for a given anaphor, and the other by ...
This paper introduces an unsupervised morphological segmentation algorithm that shows robust performance for four languages with different levels of morphological complexity. In particular, our algorithm outperforms Goldsmith's Linguistica and Creutz and Lagus's Morphessor for English and Bengali, and achieves performance that is comparable to the best ...
Most machine learning solutions to noun phrase coreference resolution recast the problem as a classification task. We examine three potential problems with this reformulation, namely, skewed class distributions, the inclusion of "hard" training instances, and the loss of transitivity inherent in the original coreference relation. We show how these ...
We present a generative model for unsupervised coreference resolution that views coreference as an EM clustering process. For comparison purposes, we revisit Haghighi and Klein's (2007) fully-generative Bayesian model for unsupervised coreference resolution, discuss its potential weaknesses and consequently propose three modifications to their model.
While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. Hence, it is not clear whether these benefits can carry over to a stronger baseline. Moreover, since there has been no attempt to apply different sources of ...
Supervised polarity classification systems are typically domain-specific. Building these systems involves the expensive process of annotating a large amount of data for each domain. A potential solution to this corpus annotation bottleneck is to build unsupervised polarity classification systems. However, unsupervised learning of polarity is difficult, ...