# Optimal Cluster Preserving Embedding of Nonmetric Proximity Data

@article{Roth2003OptimalCP, title={Optimal Cluster Preserving Embedding of Nonmetric Proximity Data}, author={Volker Roth and Julian Laub and Motoaki Kawanabe and Joachim M. Buhmann}, journal={IEEE Trans. Pattern Anal. Mach. Intell.}, year={2003}, volume={25}, pages={1540-1551} }

For several major applications of data analysis, objects are often not represented as feature vectors in a vector space, but rather by a matrix gathering pairwise proximities. Such pairwise data often violates metricity and, therefore, cannot be naturally embedded in a vector space. Concerning the problem of unsupervised structure detection or clustering, in this paper, a new embedding method for pairwise data into Euclidean vector spaces is introduced. We show that all clustering methods…

## 208 Citations

Structure Preserving Embedding of Dissimilarity Data

- Computer Science, Similarity-Based Pattern Analysis and Recognition
- 2013

The Pairwise Clustering cost function is shown to exhibit a shift invariance property which basically means that any symmetric dissimilarity matrix can be modified to allow a vector-space representation without distorting the optimal group structure.
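The shift invariance described above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the pairwise clustering cost takes the usual form (within-cluster dissimilarity sums divided by twice the cluster size) and that the shift adds a constant to every off-diagonal entry. The function names `pairwise_cost`, `centered`, and `constant_shift` are hypothetical.

```python
import numpy as np

def pairwise_cost(D, labels):
    """Assumed cost form: sum over clusters of within-cluster dissimilarities / (2 * size)."""
    cost = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        cost += D[np.ix_(idx, idx)].sum() / (2.0 * len(idx))
    return cost

def centered(D):
    """Centered matrix -1/2 * Q D Q with Q = I - (1/n) 1 1^T, symmetrized for eigvalsh."""
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n
    C = -0.5 * Q @ D @ Q
    return (C + C.T) / 2.0

def constant_shift(D):
    """Add -2 * lambda_min to all off-diagonal entries so the centered matrix has no negative eigenvalues."""
    lam_min = np.linalg.eigvalsh(centered(D)).min()
    shift = max(0.0, -2.0 * lam_min)
    n = D.shape[0]
    return D + shift * (np.ones((n, n)) - np.eye(n)), shift

# Random symmetric dissimilarities with zero diagonal (generally nonmetric).
rng = np.random.default_rng(0)
A = rng.random((8, 8))
D = (A + A.T) / 2.0
np.fill_diagonal(D, 0.0)

D_shifted, shift = constant_shift(D)

# Two partitions with the same number of clusters.
labels_a = np.array([0, 0, 0, 0, 1, 1, 1, 1])
labels_b = np.array([0, 1, 0, 1, 0, 1, 0, 1])

gap_before = pairwise_cost(D, labels_a) - pairwise_cost(D, labels_b)
gap_after = pairwise_cost(D_shifted, labels_a) - pairwise_cost(D_shifted, labels_b)
min_eig_after = np.linalg.eigvalsh(centered(D_shifted)).min()
```

Under these assumptions the shift changes every partition's cost by the same amount, `shift * (n - k) / 2` for `k` non-empty clusters, so `gap_before` and `gap_after` agree and the ranking of partitions is unchanged, while `min_eig_after` is non-negative up to rounding.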

Clustering Very Large Dissimilarity Data Sets

- Computer Science, ANNPR
- 2010

The result is an efficient linear-time data inspection method for general dissimilarity data structures; the theoretical background is presented along with applications to text and multimedia processing based on the generalized compression distance.

Dealing with non-metric dissimilarities in fuzzy central clustering algorithms

- Computer Science, Int. J. Approx. Reason.
- 2009

Distributional Scaling: An Algorithm for Structure-Preserving Embedding of Metric and Nonmetric Spaces

- Computer Science, J. Mach. Learn. Res.
- 2004

It is demonstrated that the embedding algorithm used in this paper preserves the structural properties of embedded data better than traditional MDS, and that its performance is robust with respect to clustering errors in the original data.

Topographic Mapping of Large Dissimilarity Data Sets

- Computer Science, Neural Computation
- 2010

Relational topographic maps are introduced as an extension of relational clustering algorithms, which offer prototype-based representations of dissimilarity data, to incorporate neighborhood structure. They are equivalent to the standard techniques if a Euclidean embedding exists, while avoiding the need to compute such an embedding explicitly.

Incremental Embedding Within a Dissimilarity-Based Framework

- Computer Science, GbRPR
- 2015

The pros and cons of two methods are studied that, based on dissimilarity representations, implicitly and separately compute the embedding of points in the test set and in the learning set.

Feature Discovery in Non-Metric Pairwise Data

- Computer Science, J. Mach. Learn. Res.
- 2004

It is shown by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features, which were lost by conventional data analysis techniques.
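To make the role of negative eigenvalues concrete, here is a minimal sketch, assuming the dissimilarities are centered as `-1/2 * Q D Q` and treated like squared distances. It is not the procedure of the cited paper; the names `centered_spectrum` and `negative_part_embedding` and the tolerance `tol` are hypothetical choices for illustration.

```python
import numpy as np

def centered_spectrum(D):
    """Eigen-decomposition of the centered matrix -1/2 * Q D Q."""
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n
    C = -0.5 * Q @ D @ Q
    evals, evecs = np.linalg.eigh((C + C.T) / 2.0)
    return evals, evecs

def negative_part_embedding(D, tol=1e-9):
    """Coordinates built only from eigenvectors with clearly negative eigenvalues."""
    evals, evecs = centered_spectrum(D)
    neg = evals < -tol
    return evecs[:, neg] * np.sqrt(-evals[neg])

# Read as squared distances, the underlying distances 1, 1, 3 violate
# the triangle inequality, so no Euclidean embedding exists.
D_nonmetric = np.array([[0.0, 1.0, 9.0],
                        [1.0, 0.0, 1.0],
                        [9.0, 1.0, 0.0]])

# Squared Euclidean distances between three planar points, for comparison.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D_euclid = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

neg_nonmetric = negative_part_embedding(D_nonmetric)
neg_euclid = negative_part_embedding(D_euclid)
```

In this sketch `neg_euclid` has no columns, because squared Euclidean distances give a positive semidefinite centered matrix, whereas `neg_nonmetric` has at least one column. Those columns are the part of the data that an embedding restricted to positive eigenvalues would discard.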

Two density-based k-means initialization algorithms for non-metric data clustering

- Computer Science, Pattern Analysis and Applications
- 2014

A density-based algorithm for selecting cluster representatives identifies the most central patterns from the dense regions of the dataset, using a probability density function built with the Parzen estimator, which relies on a (not necessarily metric) dissimilarity measure.

A method of relational fuzzy clustering based on producing feature vectors using FastMap

- Computer Science, Inf. Sci.
- 2009

Beyond Traditional Kernels: Classification in Two Dissimilarity-Based Representation Spaces

- Computer Science, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)
- 2008

Two simple yet powerful alternatives are provided for learning from proximity data in cases where kernel methods cannot be applied directly or are too costly or impractical, and where the NN rule leads to noisy results.

## References

A theory of proximity based clustering: structure detection by optimization

- Computer Science, Pattern Recognit.
- 2000

Pairwise Data Clustering by Deterministic Annealing

- Computer Science, IEEE Trans. Pattern Anal. Mach. Intell.
- 1997

A deterministic annealing approach to pairwise clustering is described which shares the robustness properties of maximum entropy inference; the resulting Gibbs probability distributions are estimated by mean-field approximation.

Clustering in large graphs and matrices

- Computer Science, SODA '99
- 1999

It is argued that in fact the relaxation provides a generalized clustering which is useful in its own right and can be applied to problems of very large size which typically arise in modern applications.

Data clustering: a review

- Computer Science, CSUR
- 1999

An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.

Nonlinear dimensionality reduction by locally linear embedding.

- Computer Science, Science
- 2000

Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs and learns the global structure of nonlinear manifolds.

Classification with Nonmetric Distances: Image Retrieval and Class Representation

- Computer Science, IEEE Trans. Pattern Anal. Mach. Intell.
- 2000

It is shown that in nonmetric spaces, boundary points are less significant for capturing the structure of a class than in Euclidean spaces, and it is suggested that atypical points may be more important in describing classes.

A global geometric framework for nonlinear dimensionality reduction.

- Computer Science, Science
- 2000

An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.

Kernel PCA and De-Noising in Feature Spaces

- Computer Science, NIPS
- 1998

This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real world data.

Normalized cuts and image segmentation

- Computer Science, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition
- 1997

This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.
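As a small illustration of the criterion summarized above, the sketch below evaluates the normalized cut of a two-way partition, using the standard definition `cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)` on a symmetric weight matrix. It only scores a given partition and does not perform the spectral optimization; the function name `normalized_cut` and the toy graph are assumptions made for this example.

```python
import numpy as np

def normalized_cut(W, in_a):
    """Normalized cut of the partition (A, B), where in_a marks the nodes in A."""
    in_a = np.asarray(in_a, dtype=bool)
    cut = W[np.ix_(in_a, ~in_a)].sum()   # total weight of edges crossing the partition
    assoc_a = W[in_a, :].sum()           # total weight from A to all nodes
    assoc_b = W[~in_a, :].sum()          # total weight from B to all nodes
    return cut / assoc_a + cut / assoc_b

# Toy graph: two strongly connected pairs (0-1 and 2-3) joined by a weak edge 1-2.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

ncut_balanced = normalized_cut(W, [True, True, False, False])
ncut_lopsided = normalized_cut(W, [True, False, False, False])
```

On this toy graph the partition that separates the two pairs scores lower than the one that isolates a single node, which is the behavior the normalization is meant to encourage.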

Investigation of measures for grouping by graph partitioning

- Computer Science, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001
- 2001

Using probabilistic analysis and a rigorous empirical evaluation, it is shown that minimizing the average cut and the normalized cut measures with recursive bi-partitioning will, on average, result in the correct segmentation.