- Aristides Gionis, Piotr Indyk, Rajeev Motwani
- VLDB
- 1999

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and…

- Piotr Indyk, Rajeev Motwani
- STOC
- 1998

The nearest neighbor problem is the following: Given a set of n points P = {p_1, ..., p_n} in some metric space X, preprocess P so as to efficiently answer queries which require finding the point in P closest to a query point q ∈ X. We focus on the particularly interesting case of the d-dimensional Euclidean space where X = R^d under some l_p norm. Despite…
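
As a point of reference for the problem statement above, exact nearest-neighbor search can always be answered by a linear scan over P. The sketch below is only that naive baseline, not the method of the paper; the function names `lp_distance` and `nearest_neighbor` and the sample points are illustrative assumptions.

```python
def lp_distance(x, y, p=2.0):
    """l_p distance between two equal-length coordinate sequences (p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest_neighbor(points, q, p=2.0):
    """Naive baseline: scan every point and return the one closest to q.

    Costs O(n * d) per query, which is what preprocessing-based
    index structures aim to improve on.
    """
    return min(points, key=lambda pt: lp_distance(pt, q, p))


# Hypothetical sample data, for illustration only.
P = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
query = (0.9, 1.2)
print(nearest_neighbor(P, query))
```

The scan needs no preprocessing, but its query time grows linearly in both n and d.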

- Mayur Datar, Nicole Immorlica, Piotr Indyk, Vahab S. Mirrokni
- Symposium on Computational Geometry
- 2004

We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under l_p norm, based on p-stable distributions. Our scheme improves the running time of the earlier algorithm for the case of the l_p norm. It also yields the first known provably efficient approximate NN algorithm for…
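
To make the idea of hashing with p-stable distributions concrete, here is a minimal sketch of one hash function of the commonly cited form h(v) = floor((a·v + b) / w), where a has i.i.d. p-stable entries (Gaussian for p = 2, Cauchy for p = 1) and b is uniform in [0, w]. The truncated abstract does not spell out these details, so the function name `make_hash`, the bucket width `w = 4.0`, and the seeding are assumptions for illustration.

```python
import math
import random


def make_hash(dim, w=4.0, p=2, rng=None):
    """Return one LSH function h(v) = floor((a . v + b) / w).

    a: vector of i.i.d. samples from a p-stable distribution
       (standard Gaussian for p=2, standard Cauchy for p=1).
    b: offset drawn uniformly from [0, w].
    w: bucket width (an assumed, untuned value).
    """
    rng = rng or random.Random(0)
    if p == 2:
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    elif p == 1:
        # Standard Cauchy sample via the inverse CDF.
        a = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]
    else:
        raise ValueError("this sketch only covers p = 1 and p = 2")
    b = rng.uniform(0.0, w)

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h


def collision_rate(u, v, trials=2000, p=2):
    """Fraction of independently drawn hash functions on which u and v collide."""
    rng = random.Random(42)
    hits = 0
    for _ in range(trials):
        h = make_hash(len(u), p=p, rng=rng)
        hits += h(u) == h(v)
    return hits / trials
```

Nearby points should collide under a larger fraction of such functions than distant points; `collision_rate` gives a quick empirical check of that property for any pair of vectors.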

- Soumen Chakrabarti, Byron Dom, Piotr Indyk
- SIGMOD Conference
- 1998

A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves the quality of search and profile-based routing and filtering. Therefore, an accurate classifier is an essential component of a hypertext database. Hyperlinks…

- Piotr Indyk
- FOCS
- 2000

In this article, we show several results obtained by combining the use of stable distributions with pseudorandom generators for bounded space. In particular: We show that, for any p ∈ (0, 2], one can maintain (using only O(log n/ε^2) words of storage) a sketch C(q) of a point q…

- Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani
- SIAM J. Comput.
- 2002

- Radu Berinde, Anna C. Gilbert, Piotr Indyk, Howard J. Karloff, Martin Strauss
- 2008 46th Annual Allerton Conference on…
- 2008

There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach utilizes geometric properties of the measurement matrix Φ. A notable example is the Restricted Isometry Property, which states that the mapping Φ preserves the Euclidean norm of sparse signals; it is known that random dense matrices…

- Edith Cohen, Mayur Datar, +5 authors Cheng Yang
- IEEE Trans. Knowl. Data Eng.
- 2000

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of interest are relationships that occur very frequently. However, there are a number of applications, such as data mining, identification of similar web documents,…

- Mihai Badoiu, Sariel Har-Peled, Piotr Indyk
- STOC
- 2002

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enables us to perform approximate clustering efficiently. The surprising property of those core-sets is that their size is independent of the dimension. Using those, we present (1+ε)-approximation algorithms for the…

- Sariel Har-Peled, Piotr Indyk, Rajeev Motwani
- Theory of Computing
- 2012

We present two algorithms for the approximate nearest neighbor problem in high dimensional spaces. For data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the…