Protein phosphorylation is a post-translational modification performed by a group of enzymes known as the protein kinases or phosphotransferases (Enzyme Commission classification 2.7). It is essential to the correct functioning of both proteins and cells, being involved in enzyme control, cell signalling and apoptosis. The major problem when attempting …
Feature selection is a key step in Quantitative Structure-Activity Relationship (QSAR) analysis. Chance correlations and multicollinearity are two major problems often encountered when attempting to find generalized QSAR models for use in drug design. Optimal QSAR models require an objective variable relevance analysis step for producing robust classifiers …
BACKGROUND: MicroRNAs (miRNAs) constitute a class of single-stranded RNAs that play a crucial role in regulating development and controlling gene expression by targeting messenger RNAs (mRNAs) and triggering either translation repression or mRNA degradation. miRNAs are widespread in eukaryotes, and to date over 14,000 miRNAs have been identified by …
MOTIVATION: The current paradigm for viewing metabolism, such as the Boehringer Chart or KEGG, takes a metabolite-centric view that is not ideal for genomics analysis because the same enzyme can appear in multiple places. Therefore, an enzyme-centric view is also required. RESULTS: We have eliminated synonymous compound names taken from the ENZYME database …
BACKGROUND: Single amino acid repeats make up a significant proportion of every proteome determined to date. They have been shown to be functionally and medically significant, and are associated with cancers and neurodegenerative diseases such as Huntington's chorea, where a poly-glutamine repeat is responsible for causing the …
Accurately predicting phosphorylation sites in proteins is an important issue in postgenomics, and efficiently extracting the most predictive features from amino acid sequences for modeling remains challenging. Although both the distributed encoding method and the bio-basis function method work well, each still has limitations in use. The …
BACKGROUND: Microsatellites have been used extensively in the field of comparative genomics. By studying microsatellites in coding regions, we have a simple model of how genotypic changes undergo selection, as they are directly expressed in the phenotype as altered proteins. The simplest of these tandem repeats in coding regions are the trinucleotide repeats …
MOTIVATION: To design effective HIV inhibitors, studying and understanding the mechanism of HIV protease cleavage specificity is critical. Various methods have been developed to explore the specificity of HIV protease cleavage activity. However, success in both extracting discriminant rules and maintaining high prediction accuracy is still …
The lack of reproducibility in animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods, which prevents researchers from replicating results even when the original data are provided. Here we present PhenStat, a freely available R package that …
BACKGROUND: When unaccounted-for group-level characteristics affect an outcome variable, traditional linear regression is inefficient and can be biased. The random- and fixed-effects estimators (RE and FE, respectively) are two competing methods that address these problems. While each estimator controls for otherwise unaccounted-for effects, the two …