Evaluative texts on the Web have become a valuable source of opinions on products, services, events, individuals, etc. Recently, many researchers have studied such opinion sources as product reviews, forum posts, and blogs. However, existing research has focused on classification and summarization of opinions using natural language processing and data…
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification or classification. Sentiment classification studies the problem of classifying a document or a sentence based on the subjective opinion of the author. An important application…
This paper aims to detect users who generate spam reviews, i.e., review spammers. We identify several characteristic behaviors of review spammers and model these behaviors so as to detect the spammers. In particular, we seek to model the following behaviors. First, spammers may target specific products or product groups in order to maximize their impact. Second,…
This paper studies a text mining problem, comparative sentence mining. A comparative sentence expresses an ordering relation between two sets of entities with respect to some common features. For example, the comparative sentence “Canon’s optics are better than those of Sony and Nikon” expresses the comparative relation: (better, {optics}, {Canon}, {Sony,…
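As a rough illustration of the relation described above, the example sentence could be held in a small data structure. This is a minimal sketch, not the paper's representation: the class name `ComparativeRelation`, its field names, and the helper `ordered_pairs` are hypothetical, and the second entity set is read off the example sentence itself ("Sony and Nikon"), since the tuple is cut off in the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparativeRelation:
    """Hypothetical container for one comparative relation."""
    relation_word: str       # the comparative keyword, e.g. "better"
    features: frozenset      # features being compared
    entities_s1: frozenset   # entities on the left side of the comparison
    entities_s2: frozenset   # entities on the right side of the comparison


def ordered_pairs(rel):
    """List every (left, right) entity pair covered by the relation, sorted."""
    return sorted((a, b) for a in rel.entities_s1 for b in rel.entities_s2)


# Built from the sentence "Canon's optics are better than those of Sony and Nikon".
relation = ComparativeRelation(
    relation_word="better",
    features=frozenset({"optics"}),
    entities_s1=frozenset({"Canon"}),
    entities_s2=frozenset({"Sony", "Nikon"}),  # assumption: taken from the sentence
)

for left, right in ordered_pairs(relation):
    print(f"{left} {relation.relation_word} than {right} on {sorted(relation.features)}")
```

Using frozen sets keeps the two entity groups unordered, which matches the idea of an ordering between sets rather than between individual entities.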
It is now a common practice for e-commerce Web sites to enable their customers to write reviews of products that they have purchased. Such reviews provide valuable sources of information on these products. They are used by potential customers to find opinions of existing users before deciding to purchase a product. They are also used by product…
It is well known that many online reviews are written not by genuine users of products but by spammers who write fake reviews to promote or demote some target products. Although some existing work has addressed detecting fake reviews and individual spammers, to our knowledge, no work has been done on detecting spammer groups. This paper focuses on…
Mining of opinions from product reviews, forum posts and blogs is an important research topic with many applications. However, existing research has focused on extraction, classification and summarization of opinions from these sources. An important issue that has not been studied so far is opinion spam, or the trustworthiness of online opinions. In…
This work presents a novel methodology for window detection in urban environments and its multiple uses in vision system applications. The presented method for window detection includes appropriate early image processing and provides a multi-scale Haar wavelet representation for the determination of image tiles, which is then fed into a cascaded classifier for…
This paper studies structured data extraction from Web pages. One of the effective methods is tree matching, which can detect template patterns from web pages used for extraction. However, one major limitation of existing tree matching algorithms is their inability to deal with embedded lists with repeated patterns. In the Web context, lists are everywhere,…
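To make the idea of tree matching concrete, here is a minimal sketch of one classic algorithm, Simple Tree Matching, which counts the largest number of node pairs that can be matched between two ordered trees. It is shown only as an illustration and is not necessarily the algorithm the paper proposes or compares against. The `Node` class and the toy trees are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tag-tree node: a tag name plus ordered children."""
    tag: str
    children: list = field(default_factory=list)


def simple_tree_matching(a, b):
    """Return the size of the maximum matching between trees a and b."""
    if a.tag != b.tag:
        return 0  # different roots: nothing below them can be matched
    m, n = len(a.children), len(b.children)
    # table[i][j] = best score using the first i children of a and first j of b
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = table[i - 1][j - 1] + simple_tree_matching(
                a.children[i - 1], b.children[j - 1]
            )
            table[i][j] = max(table[i - 1][j], table[i][j - 1], paired)
    return table[m][n] + 1  # +1 for the matched roots


# Two toy pages sharing a template but with lists of different lengths.
page_a = Node("div", [Node("h1"), Node("ul", [Node("li"), Node("li"), Node("li")])])
page_b = Node("div", [Node("h1"), Node("ul", [Node("li"), Node("li")])])

print(simple_tree_matching(page_a, page_b))  # prints 5
```

In this toy case the third `li` in `page_a` has no counterpart in `page_b`, so it stays unmatched. That gives a small picture of why lists with a varying number of repeated items are awkward for a plain matching score.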