We study the classification of news articles into emotions they invoke in their readers. Our work differs from previous studies, which focused on the classification of documents into their authors' emotions instead of the readers'. We use various combinations of feature sets to find the best combination for identifying the emotional influences of news… (More)
In this paper, we investigate the emotion classification of web blog corpora using support vector machine (SVM) and conditional random field (CRF) machine learning techniques. The emotion classifiers are trained at the sentence level and applied to the document level. Our methods also determine an emotion category by taking the context of a sentence into… (More)
An emotion lexicon is an indispensable resource for emotion analysis. This paper aims to mine the relationships between words and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Emotion classification at sentence level is experimented by using the mined lexicons to demonstrate their usefulness.
Past studies on emotion classification focus on the writer’s emotional state. This research addresses the reader aspect instead. The classification of documents into reader-emotion categories has several applications. One of them is to integrate reader-emotion classification into a web search engine to allow users to retrieve documents that contain… (More)
This paper introduces the novel research of emotion analysis from both the writer's and reader's perspectives. A challenge that comes up is the lack of a corpus annotated with both writer and reader emotions. We tackle this problem by combining an online writer-emotion corpus and an online reader-emotion corpus. Statistical analyses are then performed on… (More)
This paper presents two approaches to ranking reader emotions of documents. Past studies assign a document to a single emotion category , so their methods cannot be applied directly to the emotion ranking problem. Furthermore, whereas previous research analyzes emotions from the writer's perspective, this work examines readers' emotional states. The first… (More)
Identifying intent boundary in search query logs is important for learning users' behaviors and applying their experiences. Time-based, query-based, and cluster-based approaches are proposed. Experiments show that the integration of intent clusters and dynamic time model performs the best.
Gene Ontology (GO) is developed to provide standard vocabularies of gene products in different databases. The process of annotating GO terms to genes requires curators to read through lengthy articles. Methods for speeding up or automating the annotation process are thus of great importance. We propose a GO annotation approach using full-text biomedical… (More)
In this paper, we present an approach for retrieving relevant articles from the biomedical corpus. Our first run considered four kinds of operators as query expansion. The operators are phrase, mandatory, optional and synonym set. The second run lowered the ranking of documents which contained query terms only in their MeSH fields. The results of the… (More)
In this paper, we present a system for information retrieval of biomedical texts at passage level. Our system used KL-divergence as the underlying retrieval model. We further added query expansion and performed post-processing on the results. We were able to obtain a Document MAP of 0.3563, Passage MAP of 0.0464 and Aspect MAP of 0.2255 on one of the three… (More)