Lee-Feng Chien

It is crucial for cross-language information retrieval (CLIR) systems to handle the translation of unknown queries, because real queries may be short. The purpose of this paper is to investigate the feasibility of exploiting the Web as the corpus source to translate unknown queries for CLIR. We propose an online translation approach to determine...
...urgent need to promote Chinese... In this paper, we highlight the significance of keyword extraction using a new PAT-tree-based approach, which is efficient at automatically extracting keywords from a set of relevant Chinese documents. This approach has been successfully applied in several areas of IR research, such as document classification, book indexing and relevance...
This paper proposes an effective term suggestion approach to interactive Web search. Conventional approaches to making term suggestions involve extracting co-occurring keyterms from highly ranked retrieved documents. Such approaches must deal with term extraction difficulties and interference from irrelevant documents, and, more importantly, have difficulty...
This paper presents a novel approach to improving named entity translation by combining a transliteration approach with web mining, using web information as a source to complement transliteration, and using transliteration information to guide and enhance web mining. A Maximum Entropy model is employed to rank translation candidates by combining...
Many Web information services use information extraction (IE) techniques to collect important facts from the Web. To create more advanced services, one possible method is to discover thematic information from the collected facts through text classification. However, most conventional text classification techniques rely on manually labelled corpora and...
The proliferation of digital video creates a pressing need for video copy detection in content and rights management. An efficient video copy detection technique should be able to deal with spatiotemporal variations (e.g., changes in brightness or frame rates) and reduce the computation cost. While most studies place more emphasis on spatial variations, less...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper, we address the problem of generating topic hierarchies for diverse text segments with a general and practical approach that uses the Web as an additional knowledge source. Unlike...
Previous work on automatic query clustering mostly generates a flat, un-nested partition of query terms. In this work, we aim to organize query terms into a hierarchical structure and construct a query taxonomy automatically. The proposed approach is based on a hierarchical agglomerative clustering algorithm to hierarchically group...