Quanzhi Li

However, many text mining applications lack natural language processing capability beyond simple keyword indexing, and as a result too many textual elements (words) are included in the analysis. We argue that noun phrases are better suited as textual elements for text mining and could provide more discriminating power than single words. …
Effective and efficient searching and presentation of returned results are key to a search engine. Before downloading and examining the document text, users usually first judge the relevance of a returned hit to the query by looking at the document metadata presented in the returned result. However, the metadata that comes with the returned hit is usually not …
Automated medical concept recognition is important for medical informatics tasks such as medical document retrieval and text mining research. In this paper, we present a software tool called the keyphrase identification program (KIP) for identifying topical concepts in medical documents. KIP combines two functions: noun phrase extraction and keyphrase …
In this paper, we propose the first real-time rumor debunking algorithm for Twitter. We use cues from the 'wisdom of the crowds', that is, the aggregate 'common sense' and investigative journalism of Twitter users. We concentrate on identifying a rumor as an event that may comprise one or more conflicting microblogs. We continue monitoring the rumor …
This paper presents a hybrid concept hierarchy development technique for web documents retrieved by a meta-search engine. The aim of the technique is to separate the initially retrieved documents into topically oriented categories prior to the actual concept hierarchy generation. The topical categories correspond to different semantic aspects of the …
The paper presents a hybrid technique for the classification of web-returned hits into concept hierarchies. The technique involves a combination of manual and automatic classifiers. At first, all web-returned documents are assigned to human-defined categories using manual classifiers, and then automatic classifiers are used to generate a concept hierarchy …
With the proliferation of digital textual data, text mining is becoming increasingly important for deriving competitive advantage. One factor in successful text mining applications is the ability to find significant topical terms for discovering interesting patterns or relationships. Document keyphrases are phrases carrying the most important topical …