We present a method for condensing the information in multiple alignments of proteins into a mixture of Dirichlet densities over amino acid distributions. Dirichlet mixture densities are designed to be combined with observed amino acid frequencies to form estimates of expected amino acid probabilities at each position in a profile, hidden Markov model or ...
Molecular profiling studies can generate abundance measurements for thousands of transcripts, proteins, metabolites, or other species in, for example, normal and tumor tissue samples. Treating such measurements as features and the samples as labeled data points, sparse hyperplanes provide a statistical methodology for classifying data points into one of two ...
A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced. This method uses Dirichlet mixture densities as priors over amino acid distributions. These mixture densities are determined from examination of previously ...
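To make the idea of combining a Dirichlet mixture prior with observed counts concrete, here is a minimal sketch of the standard posterior-mean estimate under such a prior. It is an illustration under stated assumptions, not the authors' implementation: the function names (`log_beta`, `posterior_mean`), the three-letter toy alphabet, and the mixture parameters are all hypothetical. Each component's posterior weight is taken proportional to its mixture coefficient times the ratio of multivariate Beta functions B(n + alpha) / B(alpha), and the estimate is the weighted average of the per-component pseudocount estimates.

```python
import math


def log_beta(v):
    """Log of the multivariate Beta function B(v) = prod Gamma(v_i) / Gamma(sum v_i)."""
    return sum(math.lgamma(x) for x in v) - math.lgamma(sum(v))


def posterior_mean(counts, weights, alphas):
    """Posterior-mean letter probabilities under a Dirichlet mixture prior.

    counts  : observed counts n_i for one column (hypothetical toy input)
    weights : mixture coefficients q_j (assumed to sum to 1)
    alphas  : one Dirichlet parameter vector alpha_j per component
    """
    # Log posterior weight of each component, up to a shared constant:
    # log q_j + log B(n + alpha_j) - log B(alpha_j)
    log_post = []
    for q, alpha in zip(weights, alphas):
        combined = [n + a for n, a in zip(counts, alpha)]
        log_post.append(math.log(q) + log_beta(combined) - log_beta(alpha))

    # Normalize in a numerically stable way (subtract the maximum first).
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    z = sum(unnorm)
    resp = [u / z for u in unnorm]

    # Weighted average of the single-Dirichlet estimates (n_i + a_i) / (|n| + |alpha|).
    total_n = sum(counts)
    probs = [0.0] * len(counts)
    for r, alpha in zip(resp, alphas):
        denom = total_n + sum(alpha)
        for i, (n, a) in enumerate(zip(counts, alpha)):
            probs[i] += r * (n + a) / denom
    return probs


# Toy usage: a 3-letter alphabet and a 2-component mixture (made-up numbers).
toy_counts = [3, 0, 1]
toy_weights = [0.6, 0.4]
toy_alphas = [[2.0, 0.5, 0.5], [0.5, 0.5, 2.0]]
toy_probs = posterior_mean(toy_counts, toy_weights, toy_alphas)
```

With a single component the estimate reduces to ordinary pseudocount smoothing; with several, components whose parameters resemble the observed counts receive more weight.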
Transcription profiling experiments permit the expression levels of many genes to be measured simultaneously. Given profiling data from two types of samples, genes that most distinguish the samples (marker genes) are good candidates for subsequent in-depth experimental studies and developing decision support systems for diagnosis, prognosis, and monitoring. ...
BACKGROUND: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many ...
Abbreviations: 2D, two-dimensional, or monolayer, cultures; 3D, three-dimensional cultures of cells embedded in extracellular matrix components; BCE-1, bovine casein element 1; BM, basement membrane; ECM, extracellular matrix. Introduction: A problem in developmental biology that continues to take center stage is how higher organisms generate diverse tissues and organs ...
Stochastic context-free grammars (SCFGs) can be applied to the problems of folding, aligning and modeling families of homologous RNA sequences. SCFGs capture the sequences' common primary and secondary structure and generalize the hidden Markov models (HMMs) used in related work on protein and DNA. This paper discusses our new algorithm, Tree-Grammar EM, ...
We describe an exploratory, data-oriented approach for identifying candidates for differential gene expression in cDNA microarray experiments in terms of alpha-outliers and outlier regions, using simultaneous tolerance intervals relative to the line of equivalence (Cy5 = Cy3). We demonstrate the improved performance of our approach over existing ...