Authoritative Sources in a Hyperlinked Environment

  • Jon M. Kleinberg
  • Published 1998


The link structure of a hypermedia environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. Versions of this principle have been studied in the hypertext research community and (in a context predating hypermedia) through journal citation analysis in the field of bibliometrics. But for the problem of searching in hyperlinked environments such as the World Wide Web, it is clear from the prevalent techniques that the information inherent in the links has yet to be fully exploited. In this work we develop a new method for automatically extracting certain types of information about a hypermedia environment from its link structure, and we report on experiments that demonstrate its effectiveness for a variety of search problems on the WWW. The central problem we consider is that of determining the relative "authority" of pages in such environments. This issue is central to a number of basic hypertext search tasks; for example, if the result of a query-based search consists of a large set of relevant pages, one may wish to select a small subset of the most "definitive" or "authoritative" pages to present to a user. At the same time, it is clearly difficult to formulate a definition of authority precise enough to be used in such contexts. We propose and test an algorithmic formulation of the notion of authority, based on a method for locating dense bipartite communities in the link structure. Our formulation has an interesting interpretation in terms of the eigenvectors of certain matrices associated with the link graph; this motivates additional heuristics for clustering and for computing a type of link-based similarity among hyperlinked documents.
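
The abstract does not spell out the computation, so the following is only a minimal sketch of one way an eigenvector-based authority score could be computed from a link graph. It assumes a mutually reinforcing iteration: a page's authority score is the sum of the "hub" scores of the pages linking to it, and a page's hub score is the sum of the authority scores of the pages it links to. With normalization after each round, the authority scores approach a principal eigenvector of A^T A, where A is the adjacency matrix of the link graph. The function names, the dictionary input format, the iteration count, and the toy graph are all hypothetical and chosen for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale a score dictionary to unit Euclidean length (no-op if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hub_authority_scores(links, iterations=50):
    """Illustrative hub/authority iteration on a small link graph.

    links: dict mapping each page to the pages it links to
           (an assumed input format, not taken from the paper).
    Returns (hub, authority), two dicts of normalized scores.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority update: add up hub scores of pages that link in.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in set(targets):  # count each distinct link once
                new_auth[dst] += hub[src]

        # Hub update: add up the new authority scores of pages linked to.
        new_hub = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in set(targets):
                new_hub[src] += new_auth[dst]

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Toy graph (hypothetical): three hub-like pages pointing at two candidates.
example_links = {
    "h1": ["a1", "a2"],
    "h2": ["a1", "a2"],
    "h3": ["a1"],
}
hub, auth = hub_authority_scores(example_links)
ranked = sorted(auth, key=auth.get, reverse=True)
print(ranked[:2])
```

In this toy graph the pages that link to both candidates end up with the larger hub scores, and the candidate linked from all three hubs ranks first by authority, which is the kind of dense bipartite pattern the abstract refers to.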

