Quantized Compressive K-Means

@article{Schellekens2018QuantizedCK,
  title={Quantized Compressive K-Means},
  author={V. Schellekens and L. Jacques},
  journal={IEEE Signal Processing Letters},
  year={2018},
  volume={25},
  pages={1211-1215}
}
The recent framework of compressive statistical learning proposes to design tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it aims at estimating the centroids of data clusters from pooled, nonlinear, and random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources…
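To make the sketching idea concrete, here is a minimal NumPy illustration of pooled random signatures, with an optional one-bit quantization of each contribution in the spirit of the paper. The Gaussian frequency matrix Omega, the function name, and all parameter values are illustrative assumptions, not the authors' implementation.

import numpy as np

def sketch(X, Omega, quantize=False):
    # Pool (average) random signatures of a dataset X of shape (N, d).
    # Omega is a (d, m) matrix of random frequencies (e.g., Gaussian).
    # Without quantization this is the empirical average of random
    # Fourier features exp(i * Omega^T x); with quantize=True each
    # contribution is one-bit quantized (signs of cosine and sine),
    # mimicking square-wave signatures as in quantized sketching.
    proj = X @ Omega                                   # (N, m) projections
    if quantize:
        contrib = np.sign(np.cos(proj)) + 1j * np.sign(np.sin(proj))
    else:
        contrib = np.exp(1j * proj)
    return contrib.mean(axis=0)                        # m-dim pooled sketch

# Toy usage: 10,000 points around two centroids, sketch of size m = 50.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.1, size=(5000, 2)) for c in ((0, 0), (1, 1))])
Omega = rng.normal(scale=3.0, size=(2, 50))
z = sketch(X, Omega, quantize=True)   # dataset summarized by 50 complex values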
Citations

Asymmetric compressive learning guarantees with applications to quantized sketches
This work proves that the existing guarantees carry over to this asymmetric scheme, up to a controlled error term, provided some Limited Projected Distortion (LPD) property holds.
Differentially Private Compressive K-means
This work modifies the standard sketching mechanism to provide differential privacy, adding Laplace noise combined with a subsampling mechanism (each moment is computed from a subset of the dataset) to cope with the large scale of datasets.
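As a rough sketch of the perturbation idea described above (not the paper's calibrated mechanism), Laplace noise can be added to the pooled sketch; the scale b below is a placeholder that would have to be set from the sketch's sensitivity and the privacy budget, and the per-moment subsampling step is omitted.

import numpy as np

def private_sketch(X, Omega, b, rng):
    # Schematic differentially private sketch (illustrative only):
    # pool complex random signatures, then perturb the real and
    # imaginary parts with Laplace noise of scale b. Calibrating b
    # to the sketch's sensitivity and the privacy budget epsilon is
    # omitted here, as is the subsampling mechanism.
    z = np.exp(1j * (X @ Omega)).mean(axis=0)
    noise = rng.laplace(scale=b, size=z.shape) \
            + 1j * rng.laplace(scale=b, size=z.shape)
    return z + noise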
Compressive learning with privacy guarantees
This work shows that a simple perturbation of this mechanism with additive noise is sufficient to satisfy differential privacy, a well-established formalism for defining and quantifying the privacy of a random mechanism.
Sketched clustering via hybrid approximate message passing
A cluster recovery algorithm based on simplified hybrid generalized approximate message passing (SHyGAMP) is proposed, which is more efficient than the state-of-the-art sketched clustering algorithms (in both computational and sample complexity) and more efficient than k-means++ in certain regimes.
The k-means Algorithm: A Comprehensive Survey and Performance Evaluation
Variants of the k-means algorithm, including their recent developments, are discussed, and their effectiveness is investigated through experimental analysis on a variety of datasets.
Making AI Forget You: Data Deletion in Machine Learning
This paper proposes two provably efficient deletion algorithms which achieve an average of over 100x improvement in deletion efficiency across six datasets, while producing clusters of comparable statistical quality to a canonical k-means++ baseline.
Compressive Classification (Machine Learning without learning)
This work proposes a compressive learning classification method and a novel sketch function for images, combining supervised and unsupervised learning methods for compressed learning of images.
A Novel Model on Reinforce K-Means Using Location Division Model and Outlier of Initial Value for Lowering Data Cost
The study proposes a method for cutting clustering computation costs by selecting initial center points based on space division and outliers, so that the resulting clusters are less dependent on the initial cluster centers.
A Study on Correlation Analysis and Application of Communication Network Service
A correlation analysis of mobile data is conducted, including correlations between a user's mobile network business and district, between business and time, and between business and business, which is useful for optimizing the resource allocation and communication network service provided by operators and contributes to better guidance of urban planning.
SoyNet: Soybean leaf diseases classification
A computer vision approach to plant disease classification using a deep convolutional neural network, SoyNet, for soybean plant disease recognition from segmented leaf images, which outperforms nine state-of-the-art methods/models.

References

Showing 1-10 of 38 references.
Compressive K-means
This work proposes a compressive version of K-means that estimates cluster centers from a sketch, i.e., from a drastically compressed representation of the training dataset, and demonstrates empirically that CKM performs similarly to Lloyd-Max for a sketch size proportional to the number of centroids times the ambient dimension, independent of the size of the original dataset.
Compressive Statistical Learning with Random Feature Moments
A general framework, compressive statistical learning, for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch that captures the information relevant to the considered learning task.
Sketching for large-scale learning of mixture models
This work proposes a "compressive learning" framework in which one first sketches the data by computing random generalized moments of the underlying probability distribution, then estimates mixture model parameters from the sketch using an iterative algorithm analogous to greedy sparse signal recovery.
Orthogonal Matching Pursuit with Replacement
This paper proposes a novel partial hard-thresholding operator that leads to a general family of iterative algorithms that includes Orthogonal Matching Pursuit with Replacement (OMPR), and extends OMPR using locality-sensitive hashing to obtain OMPR-Hash, the first provably sub-linear algorithm for sparse recovery.
K-Means Algorithm Over Compressed Binary Data (Elsa Dupraz, 2018 Data Compression Conference)
A network of binary-valued sensors with a fusion center is considered, and it is shown that applying the K-means algorithm directly over the compressed data, without reconstructing the original sensor measurements, makes it possible to recover the clusters of the original domain.
SketchMLbox: A MATLAB toolbox for large-scale mixture learning
SketchMLbox is a MATLAB toolbox for fitting mixture models to large databases using sketching techniques, structured so that new mixture models can be easily implemented.
Representation and Coding of Signal Geometry
This paper considers randomized embeddings as an encoding mechanism and provides a framework to analyze their performance, demonstrating that it is possible to design the embedding such that it represents different ranges of distances with different precision.
An Introduction To Compressive Sampling
This paper surveys the theory of compressive sampling, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition.
Large-Scale High-Dimensional Clustering with Fast Sketching
To cope with high-dimensional datasets, it is shown how to use fast structured random matrices to compute the sketching operator efficiently; the clustering results are shown to be much more stable, on both artificial and real datasets.
Stable signal recovery from incomplete and inaccurate measurements
Suppose we wish to recover a vector x_0 ∈ R^m (e.g., a digital signal or image) from incomplete and contaminated observations y = A x_0 + e, where A is an n × m matrix with far fewer rows than columns (n ≪ m)…