Adrian E. Raftery

Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to over-confident inferences and decisions that are more risky than one thinks they are. Bayesian model …
We consider the problem of determining the structure of clustered data, without prior knowledge of the number of clusters or any other information about their composition. Data are represented by a mixture model in which each component corresponds to a different cluster. Models with varying geometric properties are obtained through Gaussian components with …
The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of Friedman and Rubin (1967). However, as currently implemented, it does not allow the specification of which features (orientation, size and shape) are to be …
MOTIVATION: Clustering is a useful exploratory technique for the analysis of gene expression data. Many different heuristic clustering algorithms have been proposed in this context. Clustering algorithms based on probability models offer a principled alternative to heuristic algorithms. In particular, model-based clustering assumes that the data are generated …
We introduce the weighted likelihood bootstrap (WLB) as a simple way of approximately simulating from a posterior distribution. This is easy to implement, requiring only an algorithm for calculating the maximum likelihood estimator, such as the EM algorithm or iteratively reweighted least squares; it does not necessarily require actual calculation of the …
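As a rough illustration of the idea described above, the following minimal sketch applies a weighted likelihood bootstrap to the mean of a normal sample with known variance, where the weighted maximum likelihood estimator is simply the weighted average. The choice of uniform Dirichlet weights (generated by normalising Exponential(1) variates), the toy data, and the function names `weighted_mle_mean` and `wlb_draws` are assumptions made for this example and are not taken from the abstract.

```python
import random
import statistics


def weighted_mle_mean(data, weights):
    """Weighted MLE of a normal mean (known variance): the weighted average."""
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, data)) / total


def wlb_draws(data, n_draws, rng):
    """Approximate posterior draws of the mean via weighted likelihood bootstrap."""
    draws = []
    for _ in range(n_draws):
        # Assumption: Dirichlet(1, ..., 1) weights, obtained implicitly by
        # normalising i.i.d. Exponential(1) variates inside weighted_mle_mean.
        weights = [rng.expovariate(1.0) for _ in data]
        # Each draw is the maximiser of the randomly reweighted likelihood.
        draws.append(weighted_mle_mean(data, weights))
    return draws


rng = random.Random(0)  # fixed seed so the run is reproducible
data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.4, 5.7, 5.0]  # hypothetical toy sample
draws = wlb_draws(data, 2000, rng)
posterior_mean = statistics.mean(draws)
posterior_sd = statistics.stdev(draws)
print(f"approximate posterior mean: {posterior_mean:.3f}")
print(f"approximate posterior sd:   {posterior_sd:.3f}")
```

For models without a closed-form estimator, `weighted_mle_mean` would be replaced by a weighted version of whatever fitting routine is already available, which is the sense in which only a maximum likelihood algorithm is needed.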
We consider the problem of model selection and accounting for model uncertainty in high-dimensional contingency tables, motivated by expert system applications. The approach most used currently is a stepwise strategy guided by tests based on approximate asymptotic P-values, leading to the selection of a single model; inference is then conditional on the …
We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., …
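To make the averaging idea concrete, here is a minimal sketch that averages predictions over two candidate regression models, weighting each by an approximate posterior model probability. It swaps in the BIC approximation with equal prior model probabilities, which is an assumption of this example rather than the prior specification of the paper; the toy data and all function names are likewise hypothetical.

```python
import math


def fit_intercept_only(x, y):
    """Model 0: y = a + error. Returns (predict function, RSS, coefficient count)."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - mean_y) ** 2 for yi in y)
    return (lambda x_new: mean_y), rss, 1


def fit_simple_linear(x, y):
    """Model 1: y = a + b*x + error, fitted by ordinary least squares."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return (lambda x_new: intercept + slope * x_new), rss, 2


def bic(rss, n, k):
    """BIC for a Gaussian linear model, up to a constant shared by all models."""
    return n * math.log(rss / n) + k * math.log(n)


def bma_predict(x, y, x_new):
    """Model-averaged prediction using BIC-approximated posterior model weights."""
    n = len(y)
    fits = [fit_intercept_only(x, y), fit_simple_linear(x, y)]
    bics = [bic(rss, n, k) for _, rss, k in fits]
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    probs = [r / sum(raw) for r in raw]
    prediction = sum(p * predict(x_new) for p, (predict, _, _) in zip(probs, fits))
    return probs, prediction


x = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical toy data
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1]
probs, prediction = bma_predict(x, y, 9)
print("approximate posterior model probabilities:", probs)
print("model-averaged prediction at x = 9:", prediction)
```

With many candidate predictors the number of models grows exponentially, so enumerating all of them as done here is only feasible for small problems.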