Model-based Gaussian and non-Gaussian clustering

@article{Banfield1993ModelbasedGA,
  title={Model-based Gaussian and non-Gaussian clustering},
  author={Jeffrey D. Banfield and Adrian E. Raftery},
  journal={Biometrics},
  year={1993},
  volume={49},
  pages={803--821}
}
Abstract: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967). However, as currently implemented, it does not allow the specification of which features (orientation, size and shape) are to be common to all clusters and which may differ between clusters. Also, it is restricted to Gaussian distributions and it does not allow…
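
The three cluster features named in the abstract (orientation, size and shape) can be read off the eigendecomposition of a cluster's covariance matrix. The sketch below is an illustration only, not code from the paper: the function names are hypothetical, and the normalization (shape values scaled to have product 1, so that size is the d-th root of the determinant) is one common convention assumed here, which may differ from the one the authors use.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a covariance matrix into (size, shape, orientation).

    Assumed convention (not taken from the paper): sigma = size * D @ diag(shape) @ D.T,
    where the entries of `shape` multiply to 1 and the columns of D are eigenvectors.
    """
    sigma = np.asarray(sigma, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder so the largest eigenvalue comes first
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    d = sigma.shape[0]
    size = float(np.prod(eigvals) ** (1.0 / d))  # scalar controlling the cluster's volume
    shape = eigvals / size                        # relative axis lengths, product equals 1
    return size, shape, eigvecs


def compose_covariance(size, shape, orientation):
    """Rebuild the covariance matrix from its three features."""
    return size * orientation @ np.diag(shape) @ orientation.T


# Hypothetical 2-D covariance matrix used only for illustration.
sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])
size, shape, orientation = decompose_covariance(sigma)
rebuilt = compose_covariance(size, shape, orientation)
```

Under this parameterization, holding one feature fixed across clusters (for example a shared `orientation`) while letting the others vary is what it means for a feature to be "common to all clusters".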

Bayesian Non-parametric Parsimonious Gaussian Mixture for Clustering
TLDR
A new Bayesian non-parametric parsimonious mixture model is proposed, based on an infinite Gaussian mixture model with an eigenvalue decomposition of the covariance matrix of each cluster and a Chinese Restaurant Process (CRP) prior over the hidden partition.
Clustering using objective functions and stochastic search
TLDR
A stochastic search algorithm that is driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms—one that makes small scale changes to individual objects and another that performs large scale moves involving entire clusters.
Simultaneous Gaussian model-based clustering for samples of multiple origins
TLDR
This paper aims to cluster several different datasets at the same time in a context where underlying populations, even though different, are not completely unrelated: All individuals are described by the same features and partitions of identical meaning are expected.
Clustering with the multivariate normal inverse Gaussian distribution
Constrained Optimization for a Subset of the Gaussian Parsimonious Clustering Models
TLDR
A subset of the GPCM family is considered for model-based clustering, where a re-parameterized version of the famous eigenvalue decomposition of the component covariance matrices is used.
Clustering Based on a Multi-layer Mixture Model
TLDR
This paper explores a clustering approach that models each cluster by a mixture of normals, and develops algorithms to estimate the model and perform clustering based on the classification maximum likelihood (CML) and mixture maximum likelihood (MML) criteria.
Identifying connected components in Gaussian finite mixture models for clustering
  • L. Scrucca
  • Computer Science
    Comput. Stat. Data Anal.
  • 2016
Model-Based Clustering With Dissimilarities: A Bayesian Approach
TLDR
The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainties, and can be used as a tool for dimension reduction when clustering high-dimensional objects.
Model-based clustering with non-elliptically contoured distributions
TLDR
Finite mixtures of the normal inverse Gaussian distribution (and its multivariate extensions) are proposed, which start from a density that allows for skewness and fat tails, generalize the existing models, are tractable and have desirable properties.

References

Multivariate Clustering Procedures with Variable Metrics
In this work we discuss several methods used in the clustering of objects which can be represented as points in Euclidean space. Moreover, only those procedures are considered that lead to partition
Estimating the components of a mixture of normal distributions
SUMMARY The problem of estimating the components of a mixture of two normal distributions, multivariate or otherwise, with common but unknown covariance matrices is examined. The maximum likelihood
389: Separating Mixtures of Normal Distributions
Suppose a set of data is believed to have been drawn from a population that consists of a known number of k-variate normal distributions, with the same, unknown, dispersion matrix but different
Asymptotic behaviour of classification maximum likelihood estimates
SUMMARY This paper examines maximum likelihood techniques as applied to classification and clustering problems, and shows that the classification maximum likelihood technique, in which individual
9 The classification and mixture maximum likelihood approaches to cluster analysis
  • G. McLachlan
  • Mathematics
    Classification, Pattern Recognition and Reduction of Dimensionality
  • 1982
Inference and Prediction for a General Order Statistic Model with Unknown Population Size.
Abstract Suppose that the first n order statistics from a random sample of N positive random variables are observed, where N is unknown. This, the general order statistic model, has been applied to
Bayes Factors and Choice Criteria for Linear Models
SUMMARY Global and local Bayes factors are defined and their respective roles examined as choice criteria among alternative linear models. The global Bayes factor is seen to function, in appropriate
PATTERN CLUSTERING BY MULTIVARIATE MIXTURE ANALYSIS.
  • J. Wolfe
  • Mathematics
    Multivariate behavioral research
  • 1970
TLDR
The maximum-likelihood theory and numerical solution techniques are developed for a fairly general class of distributions, and the feasibility of the procedures is demonstrated by two examples of computer solutions for normal mixture models of the Fisher Iris data and of artificially generated clusters with unequal covariance matrices.
On Some Invariant Criteria for Grouping Data
TLDR
This paper attacks the problem of exploring the structure of multivariate data in search of “clusters” by using a computer procedure to obtain the “best” partition of n objects into g groups.
A Monte Carlo Study of the Sampling Distribution of the Likelihood Ratio for Mixtures of Multinormal Distributions
Abstract: Samples from spherical normal distributions were generated and fitted to hypothesized mixtures of normal distributions using the 360 NORMIX computer program for maximum likelihood