BACKGROUND: Nonnegative Matrix Factorization (NMF) is an unsupervised learning technique that has been applied successfully in several fields, including signal processing, face recognition and text mining. Recent applications of NMF in bioinformatics have demonstrated its ability to extract meaningful information from high-dimensional data such as gene…
BACKGROUND: Keyword searching through PubMed and other systems is the standard means of retrieving information from Medline. However, ad-hoc retrieval systems do not meet all of the needs of databases that curate information from literature, or of text miners developing a corpus on a topic that has many terms indicative of relevance. Several databases have…
The quest for the discovery of mathematical principles that underlie biological phenomena is ancient and ongoing. We present a geometric analysis of the complex interdigitated pavement cells in the Arabidopsis thaliana (Col.) adaxial epidermis with a view to discovering some geometric characteristics that may govern the formation of this tissue. More than…
MOTIVATION: In a diploid organism, the proportion of transcripts produced from the two parental alleles can differ substantially due, for example, to epigenetic modification that causes complete or partial silencing of one parental allele, or to cis-acting polymorphisms that affect transcriptional regulation. Counts of SNP alleles derived from EST…
MOTIVATION: Accurate detection of positive Darwinian selection can provide important insights to researchers investigating the evolution of pathogens. However, many pathogens (particularly viruses) undergo frequent recombination, and the phylogenetic methods commonly applied to detect positive selection have been shown to give misleading results when applied…
The pattern of viral diversification in newly infected individuals provides information about the host environment and immune responses typically experienced by the newly transmitted virus. For example, sites that tend to evolve rapidly across multiple early-infection patients could be involved in enabling escape from common early immune responses, could…
One of the most important genetic factors known to affect the rate of disease progression in HIV-infected individuals is the genotype at the Class I Human Leukocyte Antigen (HLA) locus, which determines the HIV peptides targeted by cytotoxic T-lymphocytes (CTLs). Individuals with HLA-B*57 or B*5801 alleles, for example, target functionally important parts…
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful, because, in…
Why do highly expressed genes have small introns? This is an important issue, not least because it provides a testing ground to compare selectionist and neutralist models of genome evolution. Some argue that small introns are selectively favoured to reduce the costs of transcription. Alternatively, large introns might permit complex regulation, not needed…
Host immune responses against infectious pathogens exert strong selective pressures favouring the emergence of escape mutations that prevent immune recognition. Escape mutations within or flanking functionally conserved epitopes can occur at a significant cost to the pathogen in terms of its ability to replicate effectively. Such mutations come under…