A Distributed and Approximated Nearest Neighbors Algorithm for an Efficient Large Scale Mean Shift Clustering

@article{Beck2019ADA,
  title={A Distributed and Approximated Nearest Neighbors Algorithm for an Efficient Large Scale Mean Shift Clustering},
  author={Ga{\"e}l Beck and Tarn Duong and Mustapha Lebbah and Hanene Azzag and Christophe C{\'e}rin},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.03833}
}
Parallel Locality Sensitive Hashing for Network Discovery from Time Series
  • Computer Science
  • 2022
TLDR
This thesis proposes a scalable system based on locality sensitive hashing, implemented in parallel with independent hash functions, and concludes with a discussion of the impact of the similarity measures on the network discovery results and proposals for further investigation of other parts of the parameter space.
Decreasing the execution time of reducers by revising clustering based on the futuristic greedy approach
TLDR
Cutting the number of reducers and revising the clustering helped the reducers perform their jobs almost simultaneously and reduced the execution time by about 3.9% compared with the fastest algorithm.
Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators
TLDR
Incomplete gamma kernels, a generalization of Locally Optimal Projection (LOP) operators, are presented, and the relation of the classical localized L1 estimator, used in the LOP operator for surface reconstruction from noisy point clouds, is revealed via a novel kernel.
Intelligent Recommendation System for E-Learning using Membership Optimized Fuzzy Logic Classifier
TLDR
The main aim of this paper is to provide personalized dynamic and continuous recommendations for online learning systems using intelligent techniques and the superiority of the proposed method is proved by the performance analysis in terms of various performance measures.
Enhancement of email spam detection using improved deep learning algorithms for cyber security
TLDR
Experimental outcomes show the ability of the proposed method to perform the spam email classification based on improved deep learning.
Segmentation of the Fabric Pattern Based on Improved Fruit Fly Optimization Algorithm
In order to improve the segmentation performance of the printed fabric pattern, a segmentation criterion based on the 3D maximum entropy which is optimized by an improved fruit fly optimization
Optimal feature selection and hybrid deep learning for direct marketing campaigns in banking applications
TLDR
A new direct marketing campaign model in banking applications using a hybrid deep learning architecture with optimal feature selection performed by a new variant of a meta-heuristic algorithm termed as Self Adaptive-Sea Lion Optimization (SA-SLnO) Algorithm.
Aspect based sentiment analysis for demonetization tweets by optimized recurrent neural network using fire fly-oriented multi-verse optimizer
In this paper, it is proposed to understand the opinion of the public regarding the policy of demonetization that was recently implemented in India through Aspect-based Sentiment Analysis (ABSA) that

References

Distributed mean shift clustering with approximate nearest neighbours
TLDR
Two further algorithmic improvements are introduced: a normal scale (NS) choice of the optimal number of nearest neighbours, and locality sensitive hashing (LSH) to approximate nearest neighbour searches to offer the potential for an efficient method for Big Data Clustering.
Nearest neighbour estimators of density derivatives, with application to mean shift clustering
DBDC: Density Based Distributed Clustering
TLDR
The complex problem of finding a suitable quality measure for evaluating distributed clusterings is discussed and two quality criteria which are compared to each other and which allow us to evaluate the quality of the DBDC algorithm are introduced.
Mean shift-based clustering
Locality-sensitive hashing scheme based on p-stable distributions
TLDR
A novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under lp norm, based on p-stable distributions that improves the running time of the earlier algorithm and yields the first known provably efficient approximate NN algorithm for the case p<1.
Mean Shift, Mode Seeking, and Clustering
  • Yizong Cheng
  • Computer Science
    IEEE Trans. Pattern Anal. Mach. Intell.
  • 1995
TLDR
Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed, and some k-means-like clustering algorithms are shown to be its special cases.
Data Clustering
TLDR
Top researchers from around the world explore the characteristics of clustering problems in a variety of application areas and explain how to glean detailed insight from the clustering process including how to verify the quality of the underlying cluster through supervision, human intervention, or the automated generation of alternative clusters.
Quick Shift and Kernel Methods for Mode Seeking
We show that the complexity of the recently introduced medoid-shift algorithm in clustering N points is O(N^2), with a small constant, if the underlying distance is Euclidean. This makes medoid shift
Mean Shift: A Robust Approach Toward Feature Space Analysis
TLDR
The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, establishing its utility in detecting the modes of the density.
Locality-Sensitive Hashing for Finding Nearest Neighbors
TLDR
This lecture note describes a technique known as locality-sensitive hashing (LSH) that allows one to quickly find similar entries in large databases using a novel and interesting class of algorithms known as randomized algorithms.