k is the most important parameter in a text categorization system based on the k-Nearest Neighbor (kNN) algorithm. In the classification process, the k documents in the training set nearest to the test document are determined first. Then, the prediction is made according to the category distribution among these k nearest neighbors. Generally speaking, the class…
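The two steps described in this abstract (find the k nearest training documents, then predict from their category distribution) can be sketched as follows. This is a minimal illustration, not the paper's method: the whitespace tokenizer, the cosine similarity over raw term counts, the unweighted majority vote, and the names `bag_of_words`, `cosine`, and `knn_classify` are all assumptions made for the example.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count terms (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(test_doc, training_set, k):
    """Predict a category from the k training documents most similar to test_doc.

    training_set is a list of (document_text, category) pairs.
    """
    test_vec = bag_of_words(test_doc)
    # Step 1: rank training documents by similarity to the test document.
    scored = sorted(
        ((cosine(test_vec, bag_of_words(doc)), label) for doc, label in training_set),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Step 2: predict from the category distribution among the k nearest.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented purely for illustration.
training_set = [
    ("the team won the football match", "sports"),
    ("the striker scored a goal in the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stocks fell as interest rates rose", "finance"),
]

print(knn_classify("a late goal won the match", training_set, k=3))
```

Because the vote is taken over exactly k neighbors, the choice of k directly changes which categories can win, which is why the abstract singles it out as the key parameter.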
In this paper we begin to investigate how to automatically determine the subjectivity orientation of questions posted by real users in community question answering (CQA) portals. Subjective questions seek answers containing private states, such as personal opinion and experience. In contrast, objective questions request objective, verifiable…
An increasingly popular method for finding information online is via Community Question Answering (CQA) portals such as Yahoo! Answers, Naver, and Baidu Knows. Searching the CQA archives, and ranking, filtering, and evaluating the submitted answers, requires intelligent processing of the questions and answers posed by the users. One important task is…
Temporal information is useful in many NLP applications, such as information extraction, question answering, and summarization. In this paper, we present a temporal parser for extracting and normalizing temporal expressions from Chinese texts. An integrated temporal framework is proposed, which includes basic temporal concepts and the classification of…
This paper presents a set of experiments on Domain Adaptation of Statistical Machine Translation systems. The experiments focus on Chinese-English and two domain-specific corpora. The paper presents a novel approach for combining multiple domain-trained translation models to achieve improved translation quality for both domain-specific as well as combined…
Although promising results have been achieved in the areas of traffic-sign detection and classification, few works have provided simultaneous solutions to these two tasks for realistic real-world images. We make two contributions to this problem. First, we have created a large traffic-sign benchmark from 100,000 Tencent Street View panoramas, going beyond…
In this research, we focus on tracking topics that originate and evolve from a specific event. Intuitively, a few key elements of a target event, such as date, location, and persons involved, would be enough for deciding whether a test story is on-topic. Consequently, a profile-based event tracking method is proposed. We attempt to build an…
Temporal information is an important attribute of a topic, and a topic usually exists for a limited period. Therefore, many researchers have explored the use of temporal information in topic detection and tracking (TDT). They use either a story's publication time or temporal expressions in the text to derive temporal relatedness between two stories or a…