Prabhakar Raghavan

Introduction to Information Retrieval is the first textbook with a coherent treatment of classical and web information retrieval, including web search and the related areas of text classification and text clustering. Written from a computer science perspective, it gives an up-to-date treatment of all aspects of the design and implementation of systems for…
optimization problem, 275, 277; adaptive adversary, 373; Adleman's Theorem, 39; Adleman, L., 41, 410, 426; Aggarwal, A., 362; Aho, A.V., 25, 187, 189, 302; Ahuja, R.K., 303; Ajtai, M., 156, 160, 361; Albers, S., 389; Aldous, D.J., 64, 155, 332; Aleliunas, R., 96, 155; Alford, W.R., 426; all-pairs shortest paths, 278-288, …
The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and into the sociological phenomena that characterize its evolution. We report on experiments on local and global properties of the web graph using two AltaVista crawls each with over 200…
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering…
A (directed) network of people connected by ratings or trust scores, and a model for propagating those trust scores, is a fundamental building block in many of today's most successful e-commerce and recommendation systems. We develop a framework of trust propagation schemes, each of which may be appropriate in certain circumstances, and evaluate the schemes…
The relation of an integer program to its rational relaxation has been the subject of considerable interest [1], [5], [11]. Such efforts fall into two categories: (1) Showing existence results for feasible solutions to an integer program in terms of the solution to its rational relaxation, and (2) Using the information derived from the solution of the…
We present Symphony, a novel protocol for maintaining distributed hash tables in a wide area network. The key idea is to arrange all participants along a ring and equip them with long distance contacts drawn from a family of harmonic distributions. Through simulation, we demonstrate that our construction is scalable, flexible, stable in the presence of…
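As a rough illustration of drawing long distance contacts on a ring, the Python sketch below samples a clockwise distance whose density is proportional to 1/x on [1/n, 1]. That specific density, the unit-length ring, and the names `draw_harmonic` and `long_distance_contacts` are assumptions made for this example; the abstract above says only that contacts come from "a family of harmonic distributions".

```python
import random


def draw_harmonic(n, rng):
    """Draw x in [1/n, 1) with density proportional to 1/x.

    Assumed density: p(x) = 1 / (x ln n). Its CDF is 1 + ln(x) / ln(n),
    so inverse-CDF sampling from a uniform u gives x = n ** (u - 1).
    """
    u = rng.random()  # uniform in [0, 1)
    return n ** (u - 1.0)


def long_distance_contacts(position, n, k, rng):
    """Return k points on a unit ring, each at a harmonic clockwise distance."""
    return [(position + draw_harmonic(n, rng)) % 1.0 for _ in range(k)]


# Hypothetical node at position 0.25 in a ring sized for about 1024 participants.
rng = random.Random(0)
contacts = long_distance_contacts(0.25, n=1024, k=4, rng=rng)
```

In a full protocol each drawn point would still have to be resolved to the participant responsible for it; that lookup step is omitted here.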
We consider the problem of approximating an integer program by first solving its relaxation linear program and "rounding" the resulting solution. For several packing problems, we prove probabilistically that there exists an integer solution close to the optimum of the relaxation solution. We then develop a methodology for converting such a probabilistic…
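One common way to "round" a fractional solution is to set each 0/1 variable to 1 independently, with probability equal to its fractional value. The Python sketch below shows only that rounding step. It assumes the relaxation has already been solved elsewhere; the function name `randomized_round` and the sample values are hypothetical, and the abstract above does not spell out the exact rounding rule.

```python
import random


def randomized_round(fractional, rng):
    """Round each x_i in [0, 1] to 1 with probability x_i, otherwise to 0."""
    return [1 if rng.random() < x else 0 for x in fractional]


# Hypothetical fractional solution of some LP relaxation (illustrative values only).
x_frac = [0.0, 0.5, 1.0, 0.25]
rng = random.Random(1)
x_int = randomized_round(x_frac, rng)
```

Under this rule the expected value of each rounded variable equals its fractional value, so any linear objective or constraint is preserved in expectation. The rounded vector can still violate individual constraints, and this sketch does not check for that.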
The web harbors a large number of communities (groups of content-creators sharing a common interest), each of which manifests itself as a set of interlinked web pages. Newsgroups and commercial web directories together contain on the order of 20,000 such communities; our particular interest here is in emerging communities, those that have little or no…
Latent semantic indexing (LSI) is an information retrieval technique based on the spectral analysis of the term-document matrix, whose empirical success had heretofore been without rigorous prediction and explanation. We prove that, under certain conditions, LSI does succeed in capturing the underlying semantics of the corpus and achieves improved retrieval…