• Corpus ID: 202760132

CODEC - Detecting Linear Correlations in Dense Clusters using coMAD-based PCA

Maximilian Archimedes Xaver Hünemörder, Anna Beer, Daniyal Kazempour, T. Seidl
The coMAD (co-median absolute deviation) is a measure for the joint median of two random variables. Previous experiments have shown that a coMAD-based PCA is more robust towards noise and outliers, yielding eigenvectors which represent linear correlation better than its covariance-based competitors. In this preliminary work we introduce CODEC (COrrelations in DEnse Clusters), a method for detecting linear correlations in dense clusters utilizing a coMAD-based PCA. The idea of CODEC is intriguingly…
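
To make the coMAD-based PCA mentioned in the abstract concrete, here is a minimal sketch in Python. It assumes the comedian definition com(X, Y) = med((X - med(X))(Y - med(Y))) and assumes that the PCA step is a plain eigen-decomposition of the resulting matrix; the abstract does not spell out either detail. The function names `comad_matrix` and `comad_pca` and the synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def comad_matrix(data):
    """Comedian (coMAD) matrix: entry (i, j) is med((X_i - med X_i) * (X_j - med X_j))."""
    data = np.asarray(data, dtype=float)
    centered = data - np.median(data, axis=0)  # centre each column on its median
    # products[n, i, j] = centered[n, i] * centered[n, j]
    products = centered[:, :, None] * centered[:, None, :]
    return np.median(products, axis=0)

def comad_pca(data):
    """Eigen-decomposition of the coMAD matrix, largest eigenvalue first (assumed PCA step)."""
    eigenvalues, eigenvectors = np.linalg.eigh(comad_matrix(data))
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Synthetic data (illustrative only): points near the line y = 2x, plus 5% gross outliers.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
points = np.column_stack([x, 2.0 * x + rng.normal(scale=0.05, size=500)])
points[:25] = rng.uniform(-50, 50, size=(25, 2))

values, vectors = comad_pca(points)
main_direction = vectors[:, 0]  # eigenvector belonging to the largest eigenvalue
```

Because every entry is a median rather than a mean, a small fraction of extreme points has limited influence on the matrix, so `main_direction` should stay close to the direction of the underlying line. Note that a comedian matrix is symmetric but not guaranteed to be positive semi-definite, which is why the sketch sorts eigenvalues rather than assuming they are non-negative.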

Global Correlation Clustering Based on the Hough Transform
An efficient and effective method that can find subspace clusters of different dimensionality even if they are sparse or are intersected by other clusters within a noisy environment is proposed.
Computing Clusters of Correlation Connected objects
This paper proposes a method called 4C (Computing Correlation Connected Clusters), based on a combination of PCA and density-based clustering, to identify local subgroups of the data objects sharing a uniform but arbitrarily complex correlation.
Finding generalized projected clusters in high dimensional spaces
Very general techniques for projected clustering are discussed which are able to construct clusters in arbitrarily aligned subspaces of lower dimensionality, which is substantially more general and realistic than currently available techniques.
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape, is presented; it requires only one input parameter and supports the user in determining an appropriate value for it.
On Mad and Comedians
A popular robust measure of dispersion of a random variable (rv) X is the median absolute deviation from the median med(|X - med(X)|), MAD for short, which is based on the median med(X) of X. By
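
The MAD formula quoted in the entry above, med(|X - med(X)|), can be sketched in a few lines of Python. The helper name `mad` and the sample values are hypothetical, chosen only to show how little a single extreme value affects the result.

```python
import numpy as np

def mad(values):
    """Median absolute deviation from the median: med(|X - med(X)|)."""
    values = np.asarray(values, dtype=float)
    return np.median(np.abs(values - np.median(values)))

# Illustrative sample with one extreme value.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
spread_mad = mad(sample)     # median is 3; absolute deviations are 2, 1, 0, 1, 97
spread_std = np.std(sample)  # the standard deviation is dominated by the value 100
```

Here the median of the absolute deviations is 1, whereas the standard deviation is pulled far upward by the single outlier.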