Xiaoqi Zheng

Computational prediction of protein structural class based solely on sequence data remains a challenging problem in protein science. Existing methods differ in the protein sequence representation models and prediction engines adopted. In this study, a powerful feature extraction method, which combines position-specific score matrix (PSSM) with auto…
Knowledge of structural class plays an important role in understanding protein folding patterns. In this study, a simple and powerful computational method, which combines a support vector machine with PSI-BLAST profiles, is proposed to predict protein structural class for low-similarity sequences. The evolutionary information encoded in the PSI-BLAST profiles is…
An information-theoretical approach, which combines a sequence decomposition technique and a fuzzy clustering algorithm, is proposed for prediction of protein structural class. This approach could bypass the process of selecting and comparing sequence features as done previously. First, distances between each pair of protein sequences are estimated using a…
Knowledge of structural class plays an important role in understanding protein folding patterns. It is therefore necessary to develop effective and reliable computational methods for prediction of protein structural class. To this end, we present a new method called NN-CDM, a nearest neighbor classifier with a complexity-based distance measure. Instead of…
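The abstract above is cut off before the complexity-based distance is defined, so the following is only a rough sketch of the general idea of pairing a nearest neighbor classifier with a compression-style distance. It swaps in the normalized compression distance computed with `zlib` as a stand-in; the function names, the toy sequences, and the labels are all hypothetical and are not taken from NN-CDM.

```python
import zlib


def compressed_size(text):
    """Length in bytes of the zlib-compressed text, used as a complexity proxy."""
    return len(zlib.compress(text.encode("ascii"), 9))


def ncd(x, y):
    """Normalized compression distance (a stand-in, not the paper's measure)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nearest_neighbor(query, training):
    """Return the label of the training sequence closest to the query.

    training is a list of (sequence, label) pairs.
    """
    closest = min(training, key=lambda pair: ncd(query, pair[0]))
    return closest[1]


# Toy data (hypothetical): repeated motifs standing in for protein sequences.
TRAINING = [
    ("HELIX" * 40, "class-1"),
    ("QWRTYPSDFGKLCVNM" * 13, "class-2"),
]
```

Calling `nearest_neighbor("HELIX" * 40, TRAINING)` assigns the query to the training sequence that compresses best together with it. Real sequences are far less repetitive than this toy data, so a general-purpose compressor may separate them poorly.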
Identification of gene-phenotype relationships is a fundamental challenge in human health. Based on the observation that genes causing the same or similar phenotypes tend to correlate with each other in the protein-protein interaction network, many network-based approaches have been proposed, built on different underlying models. A recent comparative…
Apoptosis proteins are very important for understanding the mechanism of programmed cell death. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. In this paper, based on an amino acid substitution matrix and the auto covariance transformation, we introduce a new sequence-based model, which…
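As an illustration of the auto covariance transformation mentioned above, here is a minimal sketch using the commonly cited definition: for each column of a position-by-property matrix, average the product of mean-centred values that are `lag` positions apart. The abstract is truncated before the model is specified, so the matrix layout, the function names, and the choice of lags are assumptions.

```python
def auto_covariance(column, lag):
    """Auto covariance of one numeric column at a given lag (assumes lag < len(column))."""
    n = len(column)
    mean = sum(column) / n
    total = sum(
        (column[i] - mean) * (column[i + lag] - mean) for i in range(n - lag)
    )
    return total / (n - lag)


def ac_features(matrix, max_lag):
    """Turn an L x C matrix (one row per residue) into C * max_lag features.

    The result has a fixed length regardless of the sequence length L.
    """
    columns = list(zip(*matrix))
    return [
        auto_covariance(col, lag)
        for col in columns
        for lag in range(1, max_lag + 1)
    ]
```

The appeal of this transformation is that sequences of different lengths map to feature vectors of the same length, which is what most classifiers expect.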
Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most…
Knowledge of apoptosis proteins plays an important role in understanding the mechanism of programmed cell death. Thus, annotating the function of apoptosis proteins is of significant value. Since the function of apoptosis proteins correlates with their subcellular location, the information about their subcellular location can be very useful in understanding…
In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) behaves equivalently to RE in the case of small k, (2) avoids the degeneracy when some word types are absent in one sequence but not in the…
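To make the degeneracy mentioned above concrete, the sketch below computes the classical relative entropy between the k-word frequency distributions of two sequences. It does not implement WSE, whose definition is cut off in the abstract; the function names and the convention of returning infinity for an absent word are assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequencies of the overlapping words of length k in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relative_entropy(p, q):
    """Classical relative entropy sum_w p(w) * log2(p(w) / q(w)).

    Returns math.inf when a word occurring in p is absent from q.
    """
    result = 0.0
    for word, pw in p.items():
        qw = q.get(word, 0.0)
        if qw == 0.0:
            return math.inf
        result += pw * math.log2(pw / qw)
    return result
```

For example, `relative_entropy(kmer_freqs("aacg", 2), kmer_freqs("acgt", 2))` is infinite because the word `aa` never occurs in the second sequence. Absent words become more common as k grows, since the number of possible words increases exponentially with k.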
A DNA primary sequence is a string of letters over the alphabet Ω = {a, c, g, t}. Based on all 2-combinations of the set Ω, with repetition allowed, we transform a DNA primary sequence into a special sequence over a set with cardinality 10. With the 10-letter sequence, we associate 10 nonnegative numerical sequences and then derive a…
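The cardinality of 10 follows from counting unordered pairs with repetition over four letters: C(4 + 2 - 1, 2) = 10. The sketch below enumerates them and shows one possible way to rewrite a DNA string over that alphabet. The abstract does not say how pairs are read off the sequence, so mapping each overlapping adjacent pair to its unordered form is an assumption, as are the names used.

```python
from itertools import combinations_with_replacement

ALPHABET = "acgt"

# The 10 unordered pairs: aa, ac, ag, at, cc, cg, ct, gg, gt, tt.
PAIRS = ["".join(pair) for pair in combinations_with_replacement(ALPHABET, 2)]


def to_pair_sequence(dna):
    """Map each overlapping adjacent pair to its unordered form (assumed rule)."""
    dna = dna.lower()
    return ["".join(sorted(dna[i:i + 2])) for i in range(len(dna) - 1)]
```

Under this assumed rule, `to_pair_sequence("acga")` yields `["ac", "cg", "ag"]`, with `ga` and `ag` treated as the same letter of the 10-letter alphabet.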