# Random Projection Trees for Vector Quantization

```bibtex
@article{Dasgupta2009RandomPT,
  title   = {Random Projection Trees for Vector Quantization},
  author  = {Sanjoy Dasgupta and Yoav Freund},
  journal = {IEEE Transactions on Information Theory},
  year    = {2009},
  volume  = {55},
  pages   = {3229-3242}
}
```

A simple and computationally efficient scheme for tree-structured vector quantization is presented. Unlike previous methods, its quantization error depends only on the intrinsic dimension of the data distribution, rather than the apparent dimension of the space in which the data happen to lie.
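The scheme summarized above quantizes by recursively splitting the data along random directions and using each cell's centroid as a codeword. As a rough illustration only (a plain median split on a random unit direction; the paper's actual rules also perturb the split point and sometimes split by distance from the cell mean), here is a minimal sketch, with all function names being hypothetical:

```python
import numpy as np

def build_rp_tree(points, max_leaf_size=8, rng=None):
    """Recursively split points by projecting onto a random unit direction
    and thresholding at the median projection (simplified RP-tree split)."""
    rng = np.random.default_rng() if rng is None else rng
    if len(points) <= max_leaf_size:
        # Leaf: its centroid serves as the quantization codeword for this cell.
        return {"codeword": points.mean(axis=0)}
    direction = rng.standard_normal(points.shape[1])
    direction /= np.linalg.norm(direction)
    proj = points @ direction
    threshold = np.median(proj)
    left, right = points[proj <= threshold], points[proj > threshold]
    if len(left) == 0 or len(right) == 0:
        # Degenerate split (e.g. many identical projections): stop here.
        return {"codeword": points.mean(axis=0)}
    return {"direction": direction, "threshold": threshold,
            "left": build_rp_tree(left, max_leaf_size, rng),
            "right": build_rp_tree(right, max_leaf_size, rng)}

def quantize(tree, x):
    """Route x down the tree and return the codeword of its leaf cell."""
    while "codeword" not in tree:
        side = "left" if x @ tree["direction"] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["codeword"]
```

Because the splitting directions are random rather than axis-aligned, the cell diameters shrink at a rate governed by the data's intrinsic dimension, which is the source of the guarantee claimed in the abstract.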

## 134 Citations

Dimensionality Reduction for k-means Clustering

- Computer Science, ArXiv
- 2020

Four randomized algorithms, two based on feature selection and two on feature extraction, are presented for effectively reducing the dimensionality of the k-means clustering problem.

Hopfield Networks for Vector Quantization

- Computer Science, ICANN
- 2020

This work considers the problem of finding representative prototypes within a set of data, solving it with Hopfield networks that minimize the mean discrepancy between kernel density estimates of the data and prototype distributions, and suggests that vector quantization can also be accomplished via adiabatic quantum computing.

Extremely Fast Unsupervised Codebook Learning for Landmark Recognition

- Computer Science, IEA/AIE
- 2014

This paper introduces a fast unsupervised codebook learning method, the Extremely Random Projection Forest (ERPF), an ensemble of random projection trees with randomly chosen splitting directions, which significantly outperforms other spatial-tree methods and k-means.

Vector quantization: a review

- Computer Science, Frontiers of Information Technology & Electronic Engineering
- 2019

Finding a vector quantization method that strikes a balance between speed and accuracy while consuming moderately sized memory is still a problem requiring study.

Randomized Distribution Feature for Image Classification

- Computer Science, ECAI
- 2016

The proposed randomized distribution features represent the underlying distribution of local features in each image as a vectorial feature by utilizing random Fourier features; the convergence of the similarity and distance measures based on the randomized distribution feature is proved.

Fast nearest neighbor search through sparse random projections and voting

- Computer Science, 2016 IEEE International Conference on Big Data (Big Data)
- 2016

This work proposes a method in which multiple random projection trees are combined by a novel voting scheme: the redundancy across a large number of candidate sets, obtained from independently generated random projections, is exploited to reduce the number of expensive exact distance evaluations.
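The voting idea summarized above can be illustrated loosely. The sketch below is an assumption-heavy simplification, not the cited method: instead of building full trees, it buckets points by quantiles of a single random projection per "tree", votes for points that share the query's bucket, and evaluates exact distances only for the top-voted candidates. All names and parameters are hypothetical.

```python
import numpy as np

def knn_by_rp_voting(data, query, n_trees=10, leaf_size=16, k=5,
                     n_candidates=50, rng=None):
    """Approximate k-NN via voting across random projections (simplified)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = data.shape
    votes = np.zeros(n, dtype=int)
    for _ in range(n_trees):
        # One random projection; quantile buckets stand in for leaf cells.
        direction = rng.standard_normal(d)
        proj = data @ direction
        n_buckets = max(1, n // leaf_size)
        edges = np.quantile(proj, np.linspace(0, 1, n_buckets + 1)[1:-1])
        buckets = np.searchsorted(edges, proj)
        q_bucket = np.searchsorted(edges, query @ direction)
        # Every point sharing the query's cell earns one vote.
        votes[buckets == q_bucket] += 1
    # Exact distances only for the most-voted candidates.
    candidates = np.argsort(-votes)[:n_candidates]
    dists = np.linalg.norm(data[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]
```

The design point being illustrated: points that repeatedly land in the query's cell across independent projections are likely true neighbors, so costly exact distances are restricted to that short, vote-ranked list.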

Hierarchical Clustering with Performance Guarantees

- Computer Science
- 2010

Two new algorithms for hierarchical clustering are described, one an alternative to complete linkage and the other an alternative to the k-d tree, and both are shown to admit stronger performance guarantees than the classical schemes they replace.

Fast k-NN search

- Computer Science, Big Data 2015
- 2015

This work proposes a method in which multiple random projection trees are combined by a novel voting scheme: the redundancy across a large number of candidate sets, obtained from independently generated random projections, is exploited to reduce the number of expensive exact distance evaluations.

Rates of convergence for the cluster tree

- Computer Science, Mathematics, NIPS
- 2010

Finite-sample convergence rates for the algorithm and lower bounds on the sample complexity of this estimation problem are given.

Geodesic Forests

- Computer Science, KDD
- 2020

Fast-BIC, a fast Bayesian information criterion statistic for Gaussian mixture models, is developed, and Geodesic Forests (GF) is demonstrated to be robust to high-dimensional noise, whereas other methods, such as Isomap, UMAP, and FLANN, quickly deteriorate in such settings.

## References

Showing 1-10 of 35 references

Quantization and the method of k-means

- Computer Science, IEEE Trans. Inf. Theory
- 1982

Asymptotic results from the statistical theory of k-means clustering are applied to problems of vector quantization. The behavior of quantizers constructed from long training sequences of data is…

Quantization

- Computer Science, IEEE Trans. Inf. Theory
- 1998

The key to a successful quantization is the selection of an error criterion – such as entropy and signal-to-noise ratio – and the development of optimal quantizers for this criterion.

Foundations of Quantization for Probability Distributions

- Mathematics
- 2000

General properties of quantization for probability distributions; asymptotic quantization for nonsingular probability distributions; asymptotic quantization for singular probability…

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

- Computer Science, Neural Computation
- 2003

This work proposes a geometrically motivated algorithm for representing the high-dimensional data that provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering.

Optimal pruning with applications to tree-structured source coding and modeling

- Computer Science, IEEE Trans. Inf. Theory
- 1989

An algorithm introduced by L. Breiman et al. (1984) in the context of classification and regression trees is reinterpreted and extended to cover a variety of applications in source coding and…

Nonlinear dimensionality reduction by locally linear embedding.

- Computer Science, Science
- 2000

Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs that learns the global structure of nonlinear manifolds.

A global geometric framework for nonlinear dimensionality reduction.

- Computer Science, Science
- 2000

An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.

Clustering Large Graphs via the Singular Value Decomposition

- Computer Science, Machine Learning
- 2004

This paper considers the problem of partitioning a set of m points in n-dimensional Euclidean space into k clusters, via a continuous relaxation of this discrete problem: find the k-dimensional subspace V that minimizes the sum of squared distances of the m points to V. It is argued that the relaxation provides a generalized clustering which is useful in its own right.

Probability: Theory and Examples

- Mathematics
- 1990

This book is an introduction to probability theory covering laws of large numbers, central limit theorems, random walks, martingales, Markov chains, ergodic theorems, and Brownian motion. It is a…

Least squares quantization in PCM

- Computer Science, IEEE Trans. Inf. Theory
- 1982

The corresponding result for any finite number of quanta is derived; that is, necessary conditions are found that the quanta and associated quantization intervals of an optimum finite quantization scheme must satisfy.