Introduction to information retrieval
This textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering, starting from basic concepts. It is well suited to introductory courses in information retrieval for advanced undergraduates and graduate students in computer science.
Graph structure in the Web
Automatic subspace clustering of high dimensional data for data mining applications
CLIQUE is presented, a clustering algorithm that satisfies the following requirements of data mining applications: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.
Propagation of trust and distrust
It is shown that a small number of expressed trust or distrust judgments per individual allows trust between any two people in the system to be predicted with high accuracy.
This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Trawling the Web for Emerging Cyber-Communities
Randomized rounding: A technique for provably good algorithms and algorithmic proofs
A randomized algorithm is given for transforming an optimal solution of a relaxed problem into a provably good solution for the 0-1 problem; it can be extended to provide bounds on the disparity between the rational and 0-1 optima for a given problem instance.
Structure and Evolution Of
By analyzing the structure and content of more than one million blogs worldwide, this work has unearthed some fascinating insights into blogger behavior.
Latent semantic indexing: a probabilistic analysis
- C. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala
- Computer Science, ACM SIGACT-SIGMOD-SIGART Symposium on Principles…
- 1 May 1998
It is proved that under certain conditions LSI does succeed in capturing the underlying semantics of the corpus and achieves improved retrieval performance.
Symphony: Distributed Hashing in a Small World
- G. Manku, M. Bawa, P. Raghavan
- Computer Science, USENIX Symposium on Internet Technologies and…
- 26 March 2003
Symphony is a novel protocol for maintaining distributed hash tables in a wide area network. It is scalable, flexible, and stable in the presence of frequent updates, and it offers small average latency with only a handful of long-distance links per node.