Kai-Hsiang Yang

Today, bibliographic digital libraries play an important role in helping members of the academic community search for novel research. In particular, author disambiguation for citations is a major problem during the data integration and cleaning process, since author names are often highly ambiguous. To solve this problem, we proposed two kinds of …
The most prevalent peer-to-peer (P2P) application to date is file sharing. Unstructured P2P networks support the inherent heterogeneity of peers, are highly resilient to peer failures, and incur low overhead at peer arrivals and departures. Dynamic querying (DQ) is a new flooding technique that can estimate a proper time-to-live (TTL) value for …
In this paper, the expert-finding problem is transformed into a classification problem. We build a knowledge database that represents the expertise characteristics of a domain from web information constructed by collaborative intelligence, and we propose an incremental learning method to update the database. Furthermore, results are ranked by measuring the correlation …
Searching for information about a particular person is a common activity on search engines. However, current search engines do not provide any special function for searching for a person. Previous research has addressed the problem by using additional background knowledge, such as a friend list, to cluster the retrieved web pages. However, it is still difficult to …
Building an expert finding system is important for many applications, especially in academic environments. Previous work uses e-mails or web pages as the corpus for analyzing each expert's expertise. In this paper, we present an Expert Finding System (EFS) that builds experts' profiles from their journal publications. For a given …
The dramatic increase in the number of academic publications has led to a growing demand for efficient organization of the resources to meet researchers' specific needs. As a result, a number of network services have compiled databases from the public resources scattered over the Internet. However, publications in different conferences and journals follow …
How to rapidly disseminate a large file to many recipients is a fundamental problem in many applications, such as updating software patches and distributing large scientific data sets. In this paper, we present the Bee protocol, a cooperative peer-to-peer data dissemination protocol that aims to minimize the maximum dissemination time for all …
Identification of distinct clusters of documents in text collections has traditionally been addressed by assuming that the data instances can only be represented by homogeneous and uniform features. Much real-world data, on the other hand, comprises multiple types of heterogeneous interrelated components, such as web pages and hyperlinks, …