author={J. Shu and Yajing Li},
  journal={Journal of Biological Systems},
  • J. Shu, Yajing Li
  • Published 2010
  • Mathematics, Biology
  • Journal of Biological Systems
A hypercomplex representation of DNA is proposed to facilitate comparing DNA sequences with fuzzy composition. With the hypercomplex number representation, conventional sequence analysis methods, such as dot matrix analysis, dynamic programming, and cross-correlation, have been extended and improved to align DNA sequences with fuzzy composition. The hypercomplex dot matrix analysis can provide more control over the degree of alignment desired. A new scoring system has been proposed to…
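To make the idea of a dot matrix with an adjustable degree of alignment concrete, here is a minimal sketch. It does not reproduce the paper's hypercomplex encoding, which this abstract does not spell out. As a stand-in assumption, each position is encoded as a 4-component vector of base fractions over A, C, G, T (using a subset of the IUPAC ambiguity codes), and two positions are compared by cosine similarity. The names `encode`, `cosine`, `dot_matrix`, and the `threshold` parameter are hypothetical.

```python
import math

BASES = "ACGT"

# Subset of IUPAC ambiguity codes, mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def encode(seq):
    """Encode each symbol as a 4-vector of base fractions (assumed encoding)."""
    vectors = []
    for code in seq.upper():
        allowed = IUPAC[code]
        vectors.append([1.0 / len(allowed) if b in allowed else 0.0 for b in BASES])
    return vectors


def cosine(u, v):
    """Cosine similarity: 1.0 for identical compositions, 0.0 for disjoint ones."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def dot_matrix(seq_a, seq_b, threshold=1.0):
    """Boolean dot matrix: True where position similarity reaches the threshold."""
    a, b = encode(seq_a), encode(seq_b)
    # A small tolerance absorbs floating-point rounding at threshold == 1.0.
    return [[cosine(u, v) >= threshold - 1e-9 for v in b] for u in a]


strict = dot_matrix("AAGT", "ARGT", threshold=1.0)  # exact composition matches only
loose = dot_matrix("AAGT", "ARGT", threshold=0.7)   # also accepts A against R (A or G)
```

Lowering `threshold` lets a definite base match an ambiguous position that includes it, which is one simple way to tune how strict the dot plot is.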

DNA Sequence Representation and Comparison Based on Quaternion Number System
In the proposed method, the quaternion cross-correlation operation can be used to obtain both the global and local matching/mismatching information between two DNA sequences from the depicted one-dimensional curve and two-dimensional pattern.
Identification of DNA Motif with Mutation
  • J. Shu
  • Computer Science
  • ICCS
  • 2015
The objective is to reduce the number of ambiguous false positives encountered in DNA motif searching, thereby making the process more efficient for biologists to use.
Lecture Notes in Computer Science: Multiple DNA Sequence Alignment Using Joint Weight Matrix
This paper addresses ambiguities in selecting the best multiple sequence alignment by introducing the concept of a joint weight matrix to eliminate the randomness.
DNA sequencing using optical joint Fourier transform
The proposed optical approach facilitates exhaustive search algorithms for local and/or global DNA alignment and is capable of searching for similarity/dissimilarity between two tested DNA sequences.
Lecture Notes in Computer Science: Multiple DNA Sequence Alignment Using Joint
This paper addresses ambiguities in selecting the best multiple sequence alignment by introducing a joint weight matrix to eliminate the randomness.
The genetic code, 8-dimensional hypercomplex numbers and dyadic shifts
Hadamard matrices and orthogonal systems of Rademacher and Walsh functions participate in this discovery of hidden structural features of the genetic code with dyadic shifts and algebras of 8-dimensional hypercomplex numbers.
Identifying DNA motifs based on match and mismatch alignment information
  • J. Shu, K. Yong
  • Computer Science, Biology
  • Journal of Mathematical Chemistry
  • 2013
A novel scoring system has been introduced by taking both match and mismatch alignment information into account, which has successfully identified a correct TATA box site in the Homo sapiens H4/g gene.
An Improved Scoring Matrix for Multiple Sequence Alignment
This paper introduces the concept of a joint weight matrix to eliminate the randomness in selecting the best multiple sequence alignment; the approach can be easily implemented using the improved scoring matrix.
Matrix Genetics and Algebraic Properties of the Multi-Level System of Genetic Alphabets
Algebraic properties of the multi-level system of molecular-genetic alphabets, taking into account the important role of dyadic shifts, hypercomplex numbers and Hadamard matrices, testify that living matter has a profound algebraic essence which is interconnected with 8-dimensional vector spaces.
Fourier-based classification of protein secondary structures.
  • J. Shu, K. Yong
  • Biology, Medicine
  • Biochemical and biophysical research communications
  • 2017
A technique for the classification of protein secondary structures based on protein "signal-plotting" and the use of the Fourier technique for digital signal processing is presented, and results show that more types of protein secondary structures can be classified by means of these newly proposed indices.


Pairwise alignment of the DNA sequence using hypercomplex number representation
A new scoring system has been proposed to suit the hypercomplex number representation of the DNA base-nucleic acid codes and incorporated with the method of dot matrix analysis and various algorithms of sequence alignment.
Frequency-domain analysis of biomolecular sequences
An optimization procedure improving upon traditional Fourier analysis performance in distinguishing coding from noncoding regions in DNA sequences is provided, and it is demonstrated that color spectrograms can visually provide significant information about biomolecular sequences, thus facilitating understanding of local nature, structure and function.
Fast Fourier transform-based correlation of DNA sequences using complex plane encoding
It is reported that, through the use of alternative encodings of the DNA sequence in the complex plane, the number of FFTs performed can be traded off against (i) signal-to-noise ratio, and (ii) a certain degree of filtering for local similarity via k-tuple correlation.
Long-range correlations in nucleotide sequences
This work proposes a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which it refers to as a 'DNA walk', and uncovers a remarkably long-range power law correlation.
Conversion of nucleotides sequences into genomic signals
  • P. Cristea
  • Biology, Medicine
  • Journal of cellular and molecular medicine
  • 2002
An original tetrahedral representation of the Genetic Code that better describes its structure, degeneration and evolution trends is defined and it is shown that some essential features of the nucleotide sequences can be better extracted using this representation.
An efficient method for matching nucleic acid sequences
Though the objective achieved is of limited interest, this method will complement algorithms for efficiently finding the longest matching parts of two sequences, and is faster than existing algorithms for finding matches allowing deletions and insertions.
Symbol-balanced quaternionic periodicity transform for latent pattern detection in DNA sequences
  • Andrzej K. Brodzik, O. Peters
  • Mathematics, Computer Science
  • Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
  • 2005
The resulting quaternionic periodicity transform outperforms the previously proposed complex periodicity transform due to enhanced, symbol-balanced sensitivity to DNA patterns.
SIGNAL SCAN: a computer program that scans DNA sequences for eukaryotic transcriptional elements
SIGNAL SCAN is a program that has been developed to aid the molecular biologist in determining what eukaryotic transcription factor elements (and other significant elements) may exist in a DNA…
Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences.
  • P. Bucher
  • Biology, Medicine
  • Journal of molecular biology
  • 1990
Optimized weight matrices defining four major eukaryotic promoter elements, the TATA-box, cap signal, CCAAT-, and GC-box are presented; they were derived by comparative sequence analysis of 502 unrelated RNA polymerase II promoter regions by a novel algorithm that is generally applicable to sequence motifs positionally correlated with a biologically defined position in the sequences.
Using sequence compression to speedup probabilistic profile matching
This work exploits string compression techniques to speed up brute-force profile matching and presents two algorithms, based on run-length and LZ78 encodings, that reduce computational complexity by the compression factor of the encoding.