We study the classification of news articles into the emotions they evoke in their readers. Our work differs from previous studies, which focused on the classification of documents into their authors' emotions instead of the readers'. We use various combinations of feature sets to find the best combination for identifying the emotional influences of news…
Most common techniques in text retrieval are based on the statistical analysis of a term, either as a word or a phrase. Statistical analysis of term frequency captures only the importance of the term within a document. Thus, to achieve a more accurate analysis, the underlying representation should indicate terms that capture the semantics of text.…
An emotion lexicon is an indispensable resource for emotion analysis. This paper aims to mine the relationships between words and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Sentence-level emotion classification experiments using the mined lexicons demonstrate their usefulness.
This paper introduces a novel study of emotion analysis from both the writer's and the reader's perspectives. One challenge is the lack of a corpus annotated with both writer and reader emotions. We tackle this problem by combining an online writer-emotion corpus and an online reader-emotion corpus. Statistical analyses are then performed on…
Past studies on emotion classification focus on the writer’s emotional state. This research addresses the reader aspect instead. The classification of documents into reader-emotion categories has several applications. One of them is to integrate reader-emotion classification into a web search engine to allow users to retrieve documents that contain…
This paper presents two approaches to ranking reader emotions of documents. Past studies assign a document to a single emotion category, so their methods cannot be applied directly to the emotion ranking problem. Furthermore, whereas previous research analyzes emotions from the writer's perspective, this work examines readers' emotional states. The first…
Identifying intent boundaries in search query logs is important for learning users' behaviors and applying their experiences. Time-based, query-based, and cluster-based approaches are proposed. Experiments show that integrating intent clusters with a dynamic time model performs best.
Gene Ontology (GO) is developed to provide standard vocabularies of gene products in different databases. The process of annotating GO terms to genes requires curators to read through lengthy articles. Methods for speeding up or automating the annotation process are thus of great importance. We propose a GO annotation approach using full-text biomedical…
Molecular targeted drugs are clinically effective anti-cancer therapies. However, tumours treated with single agents usually develop resistance. Here we use colorectal cancer (CRC) as a model to study how the acquisition of resistance to EGFR-targeted therapies can be restrained. Pathway-oriented genetic screens reveal that CRC cells escape from EGFR…
In this paper, we propose an approach to Gene Ontology (GO) annotation of full-text biomedical articles. The system explores the word proximity relationship between genes and GO terms. We associate genes and GO terms by considering the density function between gene-GO pairs in a paragraph. Different density models are built and several evaluation…