This paper presents WIA-Opinmine, an opinion analysis system developed by the CUHK_PolyU_Tsinghua Web Information Analysis Group (WIA) for the NTCIR-7 MOAT task. Unlike most existing opinion mining systems, which recognize opinionated sentences in a one-step classification procedure, WIA-Opinmine adopts a multi-pass coarse-fine analysis strategy. A ...
This paper presents Opinmine, the CUHK opinion analysis system for the NTCIR-6 pilot task. Opinmine comprises three functional modules: (1) the Preprocessing and Assignment Module (PAM) performs word segmentation, part-of-speech (POS) tagging, and named entity recognition on the input Chinese text. It is based on a lexicalized Hidden Markov Model and ...
Lyric-based song sentiment classification seeks to assign songs appropriate sentiment labels such as light-hearted and heavy-hearted. Four problems render the vector space model (VSM)-based text classification approach ineffective: 1) Many words within song lyrics actually contribute little to sentiment; 2) Nouns and verbs used to express sentiment are ...
Two challenging issues are notable in tweet clustering. First, the sparse data problem is serious, since no tweet can be longer than 140 characters. Second, synonymy and polysemy are rather common, because users tend to express the same meaning in many different ways in tweets. Inspired by recent research indicating that Wikipedia is ...
Manually labeling documents for training a text classifier is expensive and time-consuming. Moreover, a classifier trained on labeled documents may suffer from overfitting and adaptability problems. Dataless text classification (DLTC) has been proposed as a solution to these problems, since it does not require labeled documents. Previous research in DLTC ...
The Web holds valuable, vast, and unstructured information about public opinion. Here, the history, current use, and future of opinion mining and sentiment analysis are discussed, along with relevant techniques and tools. ... of information were friends and specialized magazines or websites. Now, the "social web" provides new tools to efficiently create and ...
In the era of Web 2.0, huge volumes of consumer reviews are posted to the Internet every day. Manual approaches to detecting and analyzing fake reviews (i.e., spam) are not practical due to the problem of information overload. However, the design and development of automated methods of detecting fake reviews is a challenging research problem. The main ...