• Publications
A Framework for Clustering Evolving Data Streams
TLDR
A fundamentally different philosophy for data stream clustering is discussed, one that is guided by application-centered requirements and uses the concept of a pyramidal time frame in conjunction with a microclustering approach.
Neural Networks and Deep Learning
  • C. Aggarwal
  • Computer Science
    Springer International Publishing
  • 2018
Fast algorithms for projected clustering
TLDR
An algorithmic framework for solving the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves, is developed and tested.
On the Surprising Behavior of Distance Metrics in High Dimensional Spaces
TLDR
This paper examines the behavior of the commonly used L_k norm and shows that the problem of meaningfulness in high dimensionality is sensitive to the value of k, which means that the Manhattan distance metric (L_1) is consistently preferable to the Euclidean distance metric (L_2) for high dimensional data mining applications.
Outlier Analysis
  • C. Aggarwal
  • Computer Science
    Springer New York
  • 11 January 2013
TLDR
Outlier Analysis is a comprehensive exposition of the subject as understood by data mining experts, statisticians, and computer scientists, with emphasis placed on simplifying the content so that students and practitioners can also benefit.
On the design and quantification of privacy preserving data mining algorithms
TLDR
It is proved that the EM algorithm converges to the maximum likelihood estimate of the original distribution based on the perturbed data, and metrics for the quantification and measurement of privacy-preserving data mining algorithms are proposed.
Graph Clustering
  • C. Aggarwal
  • Computer Science
    Encyclopedia of Machine Learning
  • 2010
Data Mining: The Textbook
This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining
Finding generalized projected clusters in high dimensional spaces
High dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Recent research results indicate that in high dimensional data, even the
Outlier detection for high dimensional data
TLDR
New techniques for outlier detection which find the outliers by studying the behavior of projections from the data set are discussed.