Model-based Gaussian and non-Gaussian clustering

@article{Banfield1993ModelbasedGA,
  title={Model-based Gaussian and non-Gaussian clustering},
  author={Jeffrey D. Banfield and Adrian E. Raftery},
  journal={Biometrics},
  year={1993},
  volume={49},
  pages={803--821}
}
Abstract: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967). However, as currently implemented, it does not allow the specification of which features (orientation, size and shape) are to be common to all clusters and which may differ between clusters. Also, it is restricted to Gaussian distributions and it does not allow…
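
The three features named in the abstract (orientation, size and shape) can be read off an eigenvalue decomposition of a cluster's covariance matrix. The sketch below is illustrative only and is not taken from the paper: the function name `decompose_covariance` is hypothetical, and it assumes one common normalization in which the shape values multiply to 1, so that size is the p-th root of the determinant. Other normalizations, such as scaling by the largest eigenvalue, are also used.

```python
import numpy as np

def decompose_covariance(sigma):
    """Split a covariance matrix into size, shape and orientation.

    Assumed convention: sigma = lam * D @ diag(a) @ D.T with prod(a) == 1.
    """
    # eigh is for symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(sigma)
    order = np.argsort(eigvals)[::-1]          # reorder, largest eigenvalue first
    eigvals = eigvals[order]
    D = eigvecs[:, order]                      # orientation: principal axes as columns
    p = sigma.shape[0]
    lam = np.prod(eigvals) ** (1.0 / p)        # size: det(sigma) ** (1/p)
    a = eigvals / lam                          # shape: relative axis lengths, product 1
    return lam, a, D

# Hypothetical example: an elongated, rotated two-dimensional cluster
sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
lam, a, D = decompose_covariance(sigma)
reconstructed = lam * D @ np.diag(a) @ D.T     # should match sigma up to rounding
```

Under this convention, holding `lam`, `a` or `D` fixed across clusters while letting the others vary corresponds to sharing size, shape or orientation respectively.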

The noise component in model-based clustering.

TLDR
This thesis introduces a model that is a finite mixture of location-scale distributions mixed with a finite number of uniforms supported on disjoint subsets of the data range, defines a maximum-likelihood-type estimator for the model, and studies its asymptotic behaviour.

Bayesian Non-parametric Parsimonious Gaussian Mixture for Clustering

TLDR
A new Bayesian non-parametric parsimonious mixture model is proposed, based on an infinite Gaussian mixture model with an eigenvalue decomposition of the covariance matrix of each cluster and a Chinese Restaurant Process (CRP) prior over the hidden partition.

Clustering using objective functions and stochastic search

TLDR
A stochastic search algorithm driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms: one that makes small-scale changes to individual objects and another that performs large-scale moves involving entire clusters.

Clustering with the multivariate normal inverse Gaussian distribution

Constrained Optimization for a Subset of the Gaussian Parsimonious Clustering Models

TLDR
A subset of the GPCM family is considered for model-based clustering, where a re-parameterized version of the famous eigenvalue decomposition of the component covariance matrices is used.

Gaussian parsimonious clustering models

Clustering Based on a Multi-layer Mixture Model

TLDR
This paper explores a clustering approach that models each cluster by a mixture of normals, and develops algorithms to estimate the model and perform clustering based on the classification maximum likelihood (CML) and mixture maximum likelihood (MML) criteria.

Identifying connected components in Gaussian finite mixture models for clustering

  • L. Scrucca
  • Computer Science
    Comput. Stat. Data Anal.
  • 2016

Model-Based Clustering With Dissimilarities: A Bayesian Approach

TLDR
The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainties, and can be used as a tool for dimension reduction when clustering high-dimensional objects.

References

Multivariate Clustering Procedures with Variable Metrics

In this work we discuss several methods used in the clustering of objects which can be represented as points in Euclidean space. Moreover, only those procedures are considered that lead to partition

Estimating the components of a mixture of normal distributions

SUMMARY The problem of estimating the components of a mixture of two normal distributions, multivariate or otherwise, with common but unknown covariance matrices is examined. The maximum likelihood

389: Separating Mixtures of Normal Distributions

Suppose a set of data is believed to have been drawn from a population that consists of a known number of k-variate normal distributions, with the same, unknown, dispersion matrix but different

Asymptotic behaviour of classification maximum likelihood estimates

SUMMARY This paper examines maximum likelihood techniques as applied to classification and clustering problems, and shows that the classification maximum likelihood technique, in which individual

9 The classification and mixture maximum likelihood approaches to cluster analysis

  • G. McLachlan
  • Mathematics
    Classification, Pattern Recognition and Reduction of Dimensionality
  • 1982

Inference and Prediction for a General Order Statistic Model with Unknown Population Size.

Abstract Suppose that the first n order statistics from a random sample of N positive random variables are observed, where N is unknown. This, the general order statistic model, has been applied to

Bayes Factors and Choice Criteria for Linear Models

SUMMARY Global and local Bayes factors are defined and their respective roles examined as choice criteria among alternative linear models. The global Bayes factor is seen to function, in appropriate

PATTERN CLUSTERING BY MULTIVARIATE MIXTURE ANALYSIS.

  • J. Wolfe
  • Mathematics
    Multivariate behavioral research
  • 1970
TLDR
The maximum-likelihood theory and numerical solution techniques are developed for a fairly general class of distributions, and the feasibility of the procedures is demonstrated by two examples of computer solutions for normal mixture models of the Fisher Iris data and of artificially generated clusters with unequal covariance matrices.

On Some Invariant Criteria for Grouping Data

TLDR
This paper attacks the problem of exploring the structure of multivariate data in search of “clusters” by using a computer procedure to obtain the “best” partition of n objects into g groups.

A Monte Carlo Study of the Sampling Distribution of the Likelihood Ratio for Mixtures of Multinormal Distributions

Abstract: Samples from spherical normal distributions were generated and fitted to hypothesized mixtures of normal distributions using the 360 NORMIX computer program for maximum likelihood