An information‐theoretic approach to the prediction of protein structural class
An information-theoretic approach, which combines a sequence decomposition technique and a fuzzy clustering algorithm, is proposed for predicting protein structural class; the jackknife test shows that it improves prediction accuracy over existing methods.
New Invariant of DNA Sequences
  • Chun Li, Jun Wang
  • Mathematics, Computer Science
  • J. Chem. Inf. Model.
  • 2005
The utility of the new parameter, an approximation of lambda1 that is simpler to calculate, is illustrated on the DNA sequences of five species: human, chimpanzee, mouse, rat, and gallus.
3-D maps and coupling numbers for protein sequences
Based on a five-letter model of the 20 amino acids, we propose a new 3-D graphical representation of protein sequences. From this graphical representation we then derive numerical indices called 3-D…
Phylogenetic analysis of DNA sequences based on k-word and rough set theory
Among alignment-free methods for sequence comparison, the k-word frequency model is well developed. However, most existing word-based methods neglect relationships among k-word…
Similarity analysis of protein sequences based on the normalized relative-entropy.
Based on the classification of 20 amino acids, a 12-D vector is constructed to describe the protein primary sequence and the examination of similarities/dissimilarities among eight different proteins illustrates the utility of the approach.
A computational method of predicting regulatory interactions in Arabidopsis based on gene expression data and sequence information
A computational method for predicting regulatory interactions in Arabidopsis from gene expression data and sequence information is introduced, and the results suggest that it could serve as a cost-effective tool for this task.
Artificial Neural Network Method for Predicting Protein Coding Genes in the Yeast Genome
The results imply that the current artificial neural network method is a useful computer technique for predicting protein-coding genes and can be extended to find genes with more complicated structures.
An S-Curve-Based Approach of Identifying Biological Sequences
The S-curve diagram can better represent biological sequences (such as proteins) within a Cartesian coordinate system; the mutation point of a biological sequence and a new standard, the degree of similarity, are also presented.
A Simple Method for Characterization and Similarity Analysis of DNA Sequences
The main conclusion is that the 3-component vector captures important features of DNA sequences and can be directly extended to deal with long DNA sequences.
Prediction of success for polymerase chain reactions using the Markov maximal order model and support vector machine.
By means of the Markov maximal order model, a 48-D feature vector is constructed to represent a DNA template and a reliable and efficient method to predict the success of PCR reactions is developed.