• Publications
Similarity estimation techniques from rounding algorithms
  • M. Charikar
  • Mathematics, Computer Science
  • STOC '02
  • 19 May 2002
TLDR
It is shown that rounding algorithms for LPs and SDPs used in the context of approximation algorithms can be viewed as locality sensitive hashing schemes for several interesting collections of objects.
Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search
TLDR
This paper proposes a new indexing scheme called multi-probe LSH, which builds on the well-known LSH technique but intelligently probes multiple buckets in a hash table that are likely to contain query results, in order to achieve the same search quality.
Finding Frequent Items in Data Streams
TLDR
This work presents a 1-pass algorithm for estimating the most frequent items in a data stream using limited storage space, which achieves better space bounds than the previously known best algorithms for this problem for several natural distributions on the item frequencies.
Greedy approximation algorithms for finding dense components in a graph
TLDR
This paper gives simple greedy approximation algorithms for the optimization problems of finding subgraphs that maximize notions of density in undirected and directed graphs, and answers an open question about the complexity of the optimization problem for directed graphs.
Min-Wise Independent Permutations
TLDR
This research was motivated by the fact that such a family of permutations is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents.
Efficient k-nearest neighbor graph construction for generic similarity measures
TLDR
NN-Descent is presented, a simple yet efficient algorithm for approximate K-NNG construction with arbitrary similarity measures; it typically converges to above 90% recall, with each point compared against only several percent of the whole dataset on average.
Approximation algorithms for directed Steiner problems
We obtain the first non-trivial approximation algorithms for the Steiner Tree problem and the Generalized Steiner Tree problem in general directed graphs. Essentially no approximation algorithms were…
The smallest grammar problem
TLDR
This paper shows that every efficient algorithm for the smallest grammar problem has approximation ratio at least 8569/8568 unless P=NP, and bounds the approximation ratios of several of the best-known grammar-based compression algorithms, including LZ78, BISECTION, SEQUENTIAL, LONGEST MATCH, GREEDY, and RE-PAIR.
Aggregating inconsistent information: Ranking and clustering
TLDR
This work almost settles a long-standing conjecture of Bang-Jensen and Thomassen and shows that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.