David J. Miller

We address statistical classifier design given a mixed training set consisting of a small labelled feature set and a (generally larger) set of unlabelled features. This situation arises, e.g., for medical images, where although training features may be plentiful, expensive expertise is required to extract their class labels. We propose a classifier …
Joint source-channel decoding based on residual source redundancy is an effective paradigm for error-resilient data compression. While previous work only considered fixed-rate systems, the extension of these techniques for variable-length encoded data was recently independently proposed by the authors [6], [7] and by Demir and Sayood [1]. In this paper, we …
Estimating the number of components (the order) in a mixture model is often addressed using criteria such as the Bayesian information criterion (BIC) and minimum message length. However, when the feature space is very large, use of these criteria may grossly underestimate the order. Here, it is suggested that this failure is not mainly attributable to the …
We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) …
This review provides a focused summary of the implications of high-dimensional data spaces produced by gene expression microarrays for building better models of cancer diagnosis, prognosis, and therapeutics. We identify the unique challenges posed by high dimensionality to highlight methodological problems and discuss recent methods in predictive …
A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method, based on which we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular …
MOTIVATION: In both genome-wide association studies (GWAS) and pathway analysis, the modest sample size relative to the number of genetic markers presents formidable computational, statistical and methodological challenges for accurately identifying markers/interactions and for building phenotype-predictive models. RESULTS: We address these objectives via …