Corpus ID: 249954039

Quantifying Distances Between Clusters with Elliptical or Non-Elliptical Shapes

Meredith L. Wallace, Lisa M. McTeague, Jessica L. Graves, Nicholas Kissel, Cristina Tortora, Bradley Wheeler, Satish Iyengar
Finite mixture models that allow for a broad range of potentially non-elliptical cluster distributions are an emerging methodological field. Such methods allow the shape of the clusters to match the natural heterogeneity of the data, rather than forcing a series of elliptical clusters. These methods are highly relevant for clustering continuous non-normal data, a common occurrence with objective data that are now routinely captured in health research. However, interpreting and comparing…

On mixtures of skew normal and skew t-distributions

A systematic classification of the existing skew symmetric distributions into four types is presented, thereby clarifying their close relationships and aiding in understanding the link between some of the proposed expectation-maximization based algorithms for the computation of the maximum likelihood estimates of the parameters of the models.

A mixture of generalized hyperbolic distributions

The ability of the authors' models to recover parameters for data from underlying Gaussian and skew‐t distributions is demonstrated and the role of generalized hyperbolic mixtures within the wider model‐based clustering, classification, and density estimation literature is discussed.

mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models

This updated version of mclust adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling.

Mixtures of Shifted Asymmetric Laplace Distributions

This work marks an important step in the non-Gaussian model-based clustering and classification direction, and a variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution.

Model-Based Clustering, Classification, and Discriminant Analysis Using the Generalized Hyperbolic Distribution: MixGHD R package

The MixGHD package for R performs model-based clustering, classification, and discriminant analysis using the generalized hyperbolic distribution (GHD), and the use of the package on real datasets is shown.

Heterogeneity Coefficients for Mahalanobis' D as a Multivariate Effect Size

Two heterogeneity coefficients for D based on the Gini coefficient are presented, a well-known index of inequality among values of a distribution, and their use is illustrated by reanalyzing some published findings from studies of gender differences.

A metric for distributions with applications to image databases

This paper uses the Earth Mover's Distance to exhibit the structure of color-distribution and texture spaces by means of Multi-Dimensional Scaling displays, and proposes a novel approach to the problem of navigating through a collection of color images, which leads to a new paradigm for image database search.

Sinkhorn Divergences for Unbalanced Optimal Transport

The formulation of Sinkhorn divergences is extended to the unbalanced setting of arbitrary positive measures, providing both theoretical and algorithmic advances, and a linear rate of convergence is shown, under mild assumptions, independent of the number of samples.

The Earth Mover's Distance as a Metric for Image Retrieval

This paper investigates the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval, and compares the retrieval performance of the EMD with that of other distances.

Automated high-dimensional flow cytometric data analysis

This work presents a direct multivariate finite mixture modeling approach, using skew and heavy-tailed distributions, to address the complexities of flow cytometric analysis and to deal with high-dimensional cytometric data without the need for projection or transformation.