Prediction of protein structural classes (α, β, α+β and α/β) from amino acid sequences is of great importance, as it benefits the study of protein function, regulation and interactions. Many methods have been developed for high-homology protein sequences, and their prediction accuracies can reach 90%. However, for low-homology sequences whose…
Fold recognition from amino acid sequences plays an important role in identifying protein structures and functions. The taxonomy-based method, which classifies a query protein into one of the known folds, has been shown to be very promising for protein fold recognition. However, extracting a set of highly discriminative features from amino acid sequences remains…
In this paper, we predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two widely used data sets were employed: 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into…
G-protein-coupled receptors (GPCRs) play a key role in diverse physiological processes and are the targets of almost two-thirds of the marketed drugs. The 3D structures of GPCRs are largely unavailable; however, a large number of GPCR primary sequences are known. To facilitate the identification and characterization of novel receptors, it is therefore very…
The promoter region plays an important role in determining where the transcription of a particular gene is initiated. Computational prediction of eukaryotic Pol II promoter sequences is one of the most significant problems in sequence analysis. Existing promoter prediction methods are still far from satisfactory. We attempt to recognize the human…
The construction of consensus genetic maps is a very challenging problem in computational biology. Many computational approaches have been proposed on the basis of only the marker order relations provided by a given set of individual genetic maps. In this article, we propose a comparative approach to constructing consensus genetic maps for a genome, which…
The additive model is a semiparametric class of models that has become extremely popular because it is more flexible than the linear model and can be fitted to high-dimensional data when fully nonparametric models become infeasible. We consider the problem of simultaneous variable selection and parametric component identification using spline approximation…
Using six lattice types (4 × 4, 5 × 5, and 6 × 6 square lattices; a 3 × 3 × 3 cubic lattice; and 2+3+4+3+2 and 4+5+6+5+4 triangular lattices), three alphabets of different sizes (HP, HNUP, and 20 letters), and two energy functions, the designability of protein structures is calculated based on random samplings of structures and common biased sampling…
Prediction of protein contact maps is of great importance since it can facilitate and improve the prediction of protein 3D structure. However, the prediction accuracy is known to be rather low. In this paper, a consensus contact map prediction method called LRcon is developed, which combines the prediction results from several complementary…