SUMMARY: A non-parametric Bayesian factor model is proposed for joint analysis of multi-platform genomics data. The approach is based on factorizing the latent space (feature space) into a shared component and a data-specific component, with the dimensionality of these components (spaces) inferred via a beta-Bernoulli process. The proposed approach is …
Point process data are commonly observed in fields like healthcare and the social sciences. Designing predictive models for such event streams is an under-explored problem, due to often scarce training data. In this work we propose a multi-task point process model, leveraging information from all tasks via a hierarchical Gaussian process (GP). Nonparametric …
We develop a sticky hidden Markov model (HMM) with a Dirichlet distribution (DD) prior, motivated by the problem of analyzing comparative genomic hybridization (CGH) data. As formulated, the sticky DD-HMM prior is employed to infer the number of states in an HMM, while also imposing state persistence. The form of the proposed hierarchical model allows …
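To make the idea of state persistence concrete, the following is a minimal sketch, not the construction from the paper. It assumes a finite number of states and draws each row of the transition matrix from a Dirichlet distribution with extra concentration on the self-transition. The function name `sample_sticky_transitions` and the parameters `alpha` and `kappa` are hypothetical names chosen for illustration.

```python
import numpy as np


def sample_sticky_transitions(num_states, alpha, kappa, rng):
    """Draw a transition matrix with Dirichlet rows biased toward self-transitions.

    Assumption (not from the source): row j has concentration alpha / num_states
    on every state, plus an extra kappa on state j itself.
    """
    concentration = np.full((num_states, num_states), alpha / num_states)
    # Extra mass on the diagonal encourages the chain to stay in its current state.
    concentration[np.diag_indices(num_states)] += kappa
    return np.vstack([rng.dirichlet(row) for row in concentration])


rng = np.random.default_rng(0)
transitions = sample_sticky_transitions(num_states=5, alpha=1.0, kappa=20.0, rng=rng)
print(np.round(transitions, 3))
```

With `kappa` large relative to `alpha`, most of each row's probability sits on the diagonal, so sampled state sequences change state rarely. Setting `kappa` to zero recovers an ordinary symmetric Dirichlet prior on each row.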
BACKGROUND: Genome-wide screening of patients with mental retardation using array comparative genomic hybridisation (CGH) has identified several novel imbalances. With this genotype-first approach, the 2q22.3q23.3 deletion was recently described as a novel microdeletion syndrome. The authors report two unrelated patients with a de novo interstitial deletion …
Analysis of biopolymer sequences and structures generally adopts one of two approaches: use of detailed biophysical theoretical models of the system with experimentally determined parameters, or largely empirical statistical models obtained by extracting parameters from large datasets. In this work, we demonstrate a merger of these two approaches using …
We develop a new Bayesian construction of the elastic net (ENet), with variational Bayesian analysis. This modeling framework is motivated by analysis of gene expression data for viruses, with a focus on H3N2 and H1N1 influenza, as well as rhinovirus and RSV (respiratory syncytial virus). Our objective is to understand the biological pathways responsible …
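For orientation, the standard (non-Bayesian) elastic net estimate is usually written as below. This is the textbook penalized least-squares form, given here as background only; the Bayesian construction in the paper may parameterize it differently. The symbols $y$ (responses), $X$ (design matrix), $\beta$ (coefficients), and $\lambda_1, \lambda_2 \ge 0$ (penalty weights) are notation assumed for this sketch.

```latex
\hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \lambda_2 \lVert \beta \rVert_2^2
```

Setting $\lambda_2 = 0$ gives the lasso and setting $\lambda_1 = 0$ gives ridge regression, so the elastic net combines sparsity with shrinkage of correlated predictors.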
BACKGROUND: Chronic lymphocytic leukemia (CLL) is typically regarded as an indolent B-cell malignancy. However, there is wide variability with regard to need for therapy, time to progressive disease, and treatment response. This clinical variability is due, in part, to biological heterogeneity between individual patients' leukemias. While much has been …
We propose a mixture model for text data designed to capture underlying structure in the history of present illness section of electronic medical records data. Additionally, we propose a method to induce bias that leads to more homogeneous sets of diagnoses for patients in each cluster. We apply our model to a collection of electronic records from an …