Graph structure in the Web
A taxonomy of web search
- A. Broder
- Computer Science
- SIGF
- 1 September 2002
This paper explores a taxonomy of web searches and discusses how global search engines evolved to deal with web-specific needs.
On the resemblance and containment of documents
- A. Broder
- Computer Science
- Proceedings. Compression and Complexity of…
- 11 June 1997
The basic idea is to reduce the resemblance and containment questions to set intersection problems that can be easily evaluated by random sampling, with the sampling done independently for each document.
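The reduction to set intersection can be illustrated with a minimal sketch. It assumes documents are turned into sets of word shingles and that seeded SHA-1 hashes stand in for random sampling; the function names, the shingle width `w`, and the number of hashes are illustrative choices, not details taken from the paper.

```python
import hashlib


def shingles(text, w=3):
    """Return the set of contiguous w-word sequences in text (w is an assumed default)."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(max(len(words) - w + 1, 1))}


def resemblance(a, b):
    """Exact resemblance of two sets: |A intersect B| / |A union B|."""
    return len(a & b) / len(a | b)


def _hash(item, seed):
    # Seeded 64-bit hash; an assumed stand-in for a random ordering of items.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def sketch(items, num_hashes=200):
    """Keep the minimum hash per seed; computed for each document on its own."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_resemblance(sketch_a, sketch_b):
    """Fraction of seeds on which the two documents share the same minimum."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)
```

Because each sketch depends on one document only, sketches can be stored once and compared pairwise later without revisiting the full text.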
Summary cache: a scalable wide-area web cache sharing protocol
This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Network Applications of Bloom Filters: A Survey
This paper surveys the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
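For readers unfamiliar with the structure, a basic Bloom filter can be sketched as follows. This is a textbook version under assumed parameters (`num_bits`, `num_hashes`, and seeded SHA-256 as the hash family are illustrative), not any specific variant from the survey.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every position for this item is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A membership test that returns False is definitive; one that returns True may be a false positive, and the rate depends on the bit-array size, the number of hashes, and how many items were added.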
Syntactic Clustering of the Web
Min-Wise Independent Permutations
- A. Broder, M. Charikar, A. Frieze, M. Mitzenmacher
- Mathematics, Computer Science
- J. Comput. Syst. Sci.
- 1 June 2000
This research was motivated by the fact that such a family of permutations is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents.
Efficient query evaluation using a two-level retrieval process
- A. Broder, David Carmel, Michael Herscovici, A. Soffer, J. Zien
- Computer Science
- CIKM '03
- 3 November 2003
This paper presents an efficient query evaluation method based on a two-level approach that reduces the total number of full evaluations by more than 90%, almost without any loss in precision or recall.
It is shown that, with high probability, the fullest box contains only ln ln n/ln 2 + O(1) balls, exponentially fewer than before. A similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted and one ball is added in the manner above.
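The gap can be observed empirically with a small simulation. The placement rule is not stated in this listing; the sketch assumes the standard two-choice rule, in which each ball samples two boxes uniformly at random and goes into the less full one, and compares it with placing each ball in a single random box. The function name and parameters are illustrative.

```python
import random


def max_load(n, choices, rng):
    """Throw n balls into n boxes; each ball goes to the least full of
    `choices` boxes sampled uniformly at random. Returns the fullest box's load."""
    loads = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        target = min(candidates, key=lambda i: loads[i])
        loads[target] += 1
    return max(loads)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n = 10_000
    print("one choice :", max_load(n, 1, rng))
    print("two choices:", max_load(n, 2, rng))
```

With `choices=1` the maximum load grows much faster with n than with `choices=2`, which is the contrast the summary refers to.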