Tianming Wang

Information on the structural classes of proteins has proven important in many fields of bioinformatics. Prediction of protein structural class for low-similarity sequences is a challenging problem. In this study, 11 features (including 8 re-used features and 3 newly designed features) are rationally utilized to reflect the general contents and ...
Knowledge of structural classes plays an important role in understanding protein folding patterns. In this paper, features based on the predicted secondary structure sequence and the corresponding E-H sequence are extracted. Then, an 11-dimensional feature vector is selected based on a wrapper feature selection algorithm and a support vector machine (SVM). ...
The objective of this study was to identify the relationship between cadmium (Cd) and stress responses in the clam Mactra veneriformis. Metallothionein (MT) and Cu-Zn superoxide dismutase (SOD) cDNAs from the clam were isolated and characterized. The full-length cDNAs of MvMT and MvSOD contained 830 and 689 nucleotides, encoding 59 and 159 amino acids, ...
Sea cucumbers are fascinating invertebrates because of their ability to rapidly regenerate many organs and appendages. In this study, the 454 cDNA sequencing method was used to characterize the transcriptome of Apostichopus japonicus in order to investigate genes that are active in regeneration. Based on sequence similarity with known genes, our analysis ...
Numerous efficient methods based on word counts have been proposed to characterize DNA sequences for comparison, database retrieval, and reconstruction of evolutionary relationships. However, most of them seem unrelated to any intrinsic characteristics of DNA. In this paper, we propose a novel statistical measure for sequence ...
So far, various approaches for phylogenetic analysis have been developed. Almost all of them focus on analyzing nucleic acid sequences or protein primary structures. In this paper, we take the physicochemical properties of amino acids into account and introduce the hydropathy profile of amino acids into phylogenetic analysis. We find that this ...
In this paper, we generalize the concept of the Riordan array. A generalized Riordan array with respect to $c_n$ is an infinite, lower triangular array determined by the pair $(g(t), f(t))$ and has the generic element $d_{n,k} = [t^n/c_n]\, g(t)(f(t))^k/c_k$, where $c_n$ is a fixed sequence of non-zero constants with $c_0 = 1$. We demonstrate that the generalized Riordan arrays have ...
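The definition above can be illustrated with a small computation. The following is a minimal sketch, not the paper's own code: it reads the generic element as d(n,k) = (c_n / c_k) times the coefficient of t^n in g(t) f(t)^k, and represents g and f as truncated lists of power-series coefficients. The function names `series_mul` and `generalized_riordan` are hypothetical, chosen only for this example.

```python
from fractions import Fraction
from math import factorial


def series_mul(a, b, size):
    """Multiply two power series (coefficient lists), truncated to `size` terms."""
    out = [Fraction(0)] * size
    for i, x in enumerate(a[:size]):
        if x == 0:
            continue
        for j, y in enumerate(b[:size - i]):
            out[i + j] += x * y
    return out


def generalized_riordan(g, f, c, size):
    """Top-left size x size block of the array for the pair (g, f) and sequence c.

    Assumed reading of the definition: d[n][k] = (c[n] / c[k]) * [t^n] g(t) f(t)^k.
    g, f, and c must each supply at least `size` entries, with c[0] == 1.
    """
    d = [[Fraction(0)] * size for _ in range(size)]
    col = [Fraction(x) for x in g[:size]]  # coefficients of g(t) * f(t)^0
    for k in range(size):
        for n in range(size):
            d[n][k] = Fraction(c[n]) * col[n] / Fraction(c[k])
        col = series_mul(col, f, size)  # advance to g(t) * f(t)^(k+1)
    return d


size = 5
# Ordinary case (c_n = 1) with g = 1/(1-t), f = t/(1-t): Pascal's triangle.
ordinary = generalized_riordan([1] * size, [0] + [1] * (size - 1), [1] * size, size)
# Exponential case (c_n = n!) with g = e^t, f = t: also Pascal's triangle.
exp_g = [Fraction(1, factorial(n)) for n in range(size)]
exp_f = [0, 1] + [0] * (size - 2)
exponential = generalized_riordan(exp_g, exp_f, [factorial(n) for n in range(size)], size)
```

Both choices of c_n recover the binomial coefficients here, which is a convenient sanity check on the assumed reading of the coefficient operator.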
In this study, a simple 4^k-dimensional feature representation vector is proposed to reconstruct phylogenetic trees, where k is the length of a word. The vector is composed of elements which characterize the relative difference of a biological sequence from a sequence generated by an independent random process. In addition, the variance of a vector which is ...
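As a rough illustration of a word-based feature vector of this kind, the sketch below builds one entry per DNA word of length k. The abstract is truncated and does not give the exact formula, so the element definition used here is an assumption: (observed word frequency minus expected frequency) divided by expected frequency, where the expected frequency is the product of the single-nucleotide frequencies under an independence model. The function name `relative_difference_vector` is hypothetical.

```python
from collections import Counter
from itertools import product


def relative_difference_vector(seq, k):
    """Return a 4**k-long vector, one entry per DNA word in lexicographic order.

    Assumed element (not taken from the paper): (observed - expected) / expected,
    with `expected` computed from independent single-nucleotide frequencies.
    Words containing characters outside ACGT are counted in the total but
    have no entry in the vector.
    """
    seq = seq.upper()
    n_words = len(seq) - k + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the word length k")

    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    counts = Counter(seq[i:i + k] for i in range(n_words))

    vector = []
    for word in map("".join, product("ACGT", repeat=k)):
        observed = counts[word] / n_words
        expected = 1.0
        for base in word:
            expected *= base_freq[base]
        # A word that cannot occur under the independence model contributes 0.
        vector.append((observed - expected) / expected if expected > 0 else 0.0)
    return vector


vec = relative_difference_vector("ACGTACGT", 2)
```

Vectors built this way for several sequences could then be compared with any standard distance to obtain a distance matrix for tree construction; which distance the study uses is not stated in the visible text.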
Up to now, various approaches for phylogenetic analysis have been developed. Almost all of them focus on analyzing nucleic acid sequences or protein primary sequences. In this paper, we propose a new sequence distance for efficient reconstruction of phylogenetic trees based on the length distribution of common subsequences between two sequences. ...