- Christophe Ambroise, Geoffrey J McLachlan
- Proceedings of the National Academy of Sciences…
- 2002

In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands) genes. Recently, results have been presented in the literature suggesting that it is possible to…

- Xindong Wu, Vipin Kumar, +11 authors Dan Steinberg
- Knowledge and Information Systems
- 2007

This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide…

- David Peel, Geoffrey J. McLachlan
- Statistics and Computing
- 2000

Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the…

- Geoffrey J. McLachlan, Richard Bean, David Peel
- Bioinformatics
- 2002

MOTIVATION
This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space…

- Saumyadipta Pyne, Xinli Hu, +9 authors Jill P. Mesirov
- RECOMB
- 2009

Flow cytometric analysis allows rapid single cell interrogation of surface and intracellular determinants by measuring fluorescence intensity of fluorophore-conjugated reagents. The availability of new platforms, allowing detection of increasing numbers of cell surface markers, has challenged the traditional technique of identifying cell populations by…

- Shu-Kay Ng, Geoffrey J. McLachlan, Kui Wang, Liat Ben-Tovim Jones, S.-W. Ng
- Bioinformatics
- 2006

MOTIVATION
The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently…

- Jangsun Baek, Geoffrey J. McLachlan, Lloyd K. Flack
- IEEE Transactions on Pattern Analysis and Machine…
- 2010

Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data, where the number of observations n is not very large relative to their dimension p. In practice, there is often the need to further reduce the number of parameters in the specification of the component-covariance matrices. To this end, we propose…

- Geoffrey J. McLachlan, Richard Bean, Liat Ben-Tovim Jones
- Computational Statistics & Data Analysis
- 2007

- Geoffrey J. McLachlan, David Peel, Richard Bean
- Computational Statistics & Data Analysis
- 2003

We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled…

- Geoffrey J. McLachlan, David Peel
- SSPR/SPR
- 1998

Normal mixture models are being increasingly used as a way of clustering sets of continuous multivariate data. They provide a probabilistic (soft) clustering of the data in terms of their fitted posterior probabilities of membership of the mixture components corresponding to the clusters. An outright (hard) clustering can be subsequently obtained by…