William Noble Grundy

We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function…
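To make the supervised setup concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the hinge loss. It is not the authors' implementation: the expression values, labels, hyperparameters, and the function names `train_linear_svm` and `predict` are all assumptions chosen for illustration.

```python
# Minimal linear SVM sketch (hinge loss + L2 penalty, subgradient descent).
# All data and parameter values below are made up for illustration.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Return (w, b) for labels y in {+1, -1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            # Shrink weights: subgradient step on the L2 regularizer.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Example violates the margin: step along the hinge subgradient.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1

# Rows are genes, columns are hypothetical log expression ratios
# across three experiments.
X = [[2.0, 1.8, 2.2], [1.9, 2.1, 2.0], [-2.0, -1.7, -2.1], [-1.8, -2.2, -1.9]]
y = [1, 1, -1, -1]  # +1 = known member of a functional class, -1 = non-member

w, b = train_linear_svm(X, y)
unknown_gene = [1.5, 1.6, 1.4]
label = predict(w, b, unknown_gene)
```

Genes with known function supply the labels, and the trained classifier is then applied to genes whose function is unknown; that use of prior knowledge is what makes the method supervised.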
Motivation: Modeling families of related biological sequences using hidden Markov models (HMMs), although increasingly widespread, faces at least one major problem: because of the complexity of these mathematical models, they require a relatively large training set in order to accurately recognize a given family. For families in which there are few known…
In our attempts to understand cellular function at the molecular level, we must be able to synthesize information from disparate types of genomic data. We consider the problem of inferring gene functional classifications from a heterogeneous data set consisting of DNA microarray expression measurements and phylogenetic profiles from whole-genome sequence…
The function of an unknown biological sequence can often be accurately inferred by identifying sequences homologous to the original sequence. Given a query set of known homologs, there exist at least three general classes of techniques for finding additional homologs: pairwise sequence comparisons, motif analysis, and hidden Markov modeling. Pairwise…
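As an illustration of the first class of techniques, pairwise sequence comparison, the sketch below computes a Smith-Waterman local alignment score. The scoring values (match 2, mismatch -1, linear gap -2) and the function name `local_alignment_score` are assumptions for this example, not parameters taken from the paper.

```python
# Smith-Waterman local alignment score with a linear gap penalty.
# Scoring parameters are illustrative assumptions.

def local_alignment_score(s, t, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings s and t."""
    prev = [0] * (len(t) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # align s[i-1] with t[j-1]
                prev[j] + gap,       # gap in t
                curr[j - 1] + gap,   # gap in s
            )
            best = max(best, curr[j])
        prev = curr
    return best

# Two toy DNA sequences sharing the segment "ACGT".
score = local_alignment_score("TTACGTTT", "GGACGTCC")
```

A homology search would score a query against every database sequence in this way and rank the hits; practical tools add substitution matrices, affine gap penalties, and significance statistics that this sketch omits.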
In this paper we consider the problem of extracting information from the upstream untranslated regions of genes to make predictions about their transcriptional regulation. We present a method for classifying genes based on motif-based hidden Markov models (HMMs) of their promoter regions. Sequence motifs discovered in yeast promoters are used to construct…
An important goal in bioinformatics is determining the homology and function of proteins from their sequences. Pairwise sequence similarity algorithms are often employed for this purpose. This paper describes a method for improving the accuracy of such algorithms using knowledge about families of proteins. The method requires a library of protein families…
Spinach CSP41 is part of a protein complex that binds to the 3' untranslated region (UTR) of petD precursor-mRNA, a chloroplast gene encoding subunit IV of the cytochrome b6/f complex. CSP41 cleaves the 3'-UTR of petD mRNA within the stem-loop structure, suggesting a key role in the control of chloroplast mRNA stability. We discovered that CSP41 is…
The increasing size of protein sequence databases is straining methods of sequence analysis, even as the increased information offers opportunities for sophisticated analyses of protein structure, function, and evolution. Here we describe a method that uses artificial intelligence-based algorithms to build models of families of protein sequences. These…