Comparative analysis of the quality of a global algorithm and a local algorithm for alignment of two sequences

Authors: Valery Polyanovsky, Mikhail A. Roytberg, Vladimir G. Tumanyan
Journal: Algorithms for Molecular Biology (AMB), p. 25
Background: Sequence alignment algorithms are the key instruments for computer-assisted studies of biopolymers. It is important to take into account the "quality" of the obtained alignments, i.e. how closely the algorithms manage to restore the "gold standard" alignment (GS-alignment), which superimposes positions originating from the same position in the common ancestor of the compared sequences. As an approximation of the GS-alignment, a 3D-alignment is commonly used not quite…
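The notion of "quality" relative to a GS-alignment can be illustrated with a minimal sketch. It assumes that an alignment is given as two gapped strings of equal length and that quality is measured by the share of aligned position pairs the two alignments have in common. The function names and the two ratios below are illustrative assumptions, not necessarily the exact measures used in the paper.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (i, j) index pairs superimposed by an alignment.

    row_a and row_b are the two gapped rows of the alignment; i and j are
    0-based positions in the ungapped sequences.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0
    for ca, cb in zip(row_a, row_b):
        if ca != gap and cb != gap:
            pairs.add((i, j))
        if ca != gap:
            i += 1
        if cb != gap:
            j += 1
    return pairs


def alignment_quality(test, gold):
    """Compare a test alignment with a gold-standard alignment.

    Returns (recovered, precision):
      recovered = shared pairs / pairs in the gold standard
      precision = shared pairs / pairs in the test alignment
    """
    test_pairs = aligned_pairs(*test)
    gold_pairs = aligned_pairs(*gold)
    shared = test_pairs & gold_pairs
    recovered = len(shared) / len(gold_pairs) if gold_pairs else 0.0
    precision = len(shared) / len(test_pairs) if test_pairs else 0.0
    return recovered, precision


# Hypothetical toy data: the same two sequences aligned in two ways.
gold = ("ACGT-A", "AC-TTA")
test = ("ACGTA", "ACTTA")
recovered, precision = alignment_quality(test, gold)
print(f"recovered={recovered:.2f} precision={precision:.2f}")
```

In this toy case the test alignment restores three of the four gold-standard pairs, and three of its own five pairs are correct.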

Estimation of the quality of global alignment of amino acid sequences based on evolution criterion

The aim of the work is to develop a common method for estimating the pairwise alignment quality versus the evolutionary distance (degree of homology) between the sequences being compared and versus

The ranging of amino acids substitution matrices of various types in accordance with the alignment accuracy criterion

The best alignment quality is achieved with evolutionary matrices designed for long distances: Gonnet, VTML250, PAM250, MIQS, and Pfasum050, i.e. suitable for aligning sequences separated by both large and small evolutionary distances.

ExtendAlign: a computational algorithm for delivering multiple, local end-to-end alignments

ExtendAlign is a computational tool that extends the alignment achieved by local MSATs and provides an end-to-end report of true m/mm for every hit in each query sequence, reducing the aforementioned alignment biases.

Rapid and Sensitive Alignment-free DNA Sequence Comparison

It is concluded that certain variations on the implementation of k-mers, such as mismatches and counting the number of occurrences of each word, might improve sensitivity and precision in certain instances, but that no k-mer implementation is universally best for the different data sets the authors have tested.
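As a rough illustration of the k-mer idea mentioned above, the following sketch counts the occurrences of each word of length k and compares two sequences by the cosine similarity of their count vectors. The choice of cosine similarity and the function names are assumptions made for this example; the summarized paper evaluates several variants and does not single out this one.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_cosine_similarity(seq_a, seq_b, k=3):
    """Cosine similarity between the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Only words present in both sequences contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has no words
    return dot / (norm_a * norm_b)


print(kmer_counts("ACGACG", 3))
print(kmer_cosine_similarity("ACGACGT", "ACGACGA", k=3))
```

No alignment is computed here: the comparison depends only on word content, which is what makes such methods fast on large data sets.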

Using alignment-free methods as preprocessing stage to classification whole genomes

Alignment-free methods have overcome the challenges of alignment-based methods for measuring the distance between sequences. The data set consists of 1000 genomes obtained from the National Center for Biotechnology Information (NCBI), reduced to 860 genomes that are ready to be segmented into words by k-mer analysis.

ExtendAlign: the post-analysis tool to correct and improve the alignment of dissimilar short sequences

ExtendAlign is useful for pinpointing the identity percentage for alignments of short sequences in the range of ∼35–50% similarity and outperforms the other aligners in most metrics tested.

Protein database search of hybrid alignment algorithm based on GPU parallel acceleration

An optimized protein database search method is presented and tested with the Swiss-Prot database on graphics processing unit (GPU) devices; CPU multi-threaded computing is also used to realize GPU-based heterogeneous parallelism.

Implementation of Hybrid Alignment Algorithm for Protein Database Search on the SW26010 Many-Core Processor

This paper designs hybrid sequence alignment by combining the Smith-Waterman local alignment algorithm and the Needleman-Wunsch global alignment algorithm, and presents an efficient method of protein database search based on Sunway TaihuLight supercomputer.
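The difference between the two algorithms named above can be shown with a short scoring sketch. It is not the hybrid method of the summarized paper. It only illustrates, under an assumed linear gap penalty and assumed match/mismatch scores, that the Needleman-Wunsch (global) and Smith-Waterman (local) recurrences differ in their boundary conditions, in a floor at zero, and in where the final score is read.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Optimal alignment score by dynamic programming with a linear gap penalty.

    local=False: Needleman-Wunsch style global score (both sequences end to end).
    local=True:  Smith-Waterman style local score (best-scoring pair of substrings).
    """
    # First row: leading gaps are penalized globally, free locally.
    prev = [0] * (len(b) + 1) if local else [j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                       prev[j] + gap,       # gap in b
                       curr[j - 1] + gap)   # gap in a
            if local:
                cell = max(cell, 0)         # a local alignment may restart anywhere
                best = max(best, cell)
            curr.append(cell)
        prev = curr
    # Global score sits in the last cell; local score is the maximum over all cells.
    return best if local else prev[-1]


print(alignment_score("ACGT", "AGT"))               # global
print(alignment_score("ACGT", "AGT", local=True))   # local
```

With these assumed parameters the global score for "ACGT" versus "AGT" has to pay for one gap, while the local score simply reports the best matching region.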

Tools and Methods in the Analysis of Simple Sequences

This chapter provides insight into different bioinformatics tools and algorithms along with some basic examples and covers the essential topics of sequence analysis for the ease of readers to understand and implement in their regular work.

Novel Methods for Analyzing and Visualizing Phylogenetic Placements

Novel methods for pre- and post-processing, analysis, and visualization of phylogenetic placement of sequences are developed, including a method to automatically obtain a suitable reference tree to be used for placement.



Reconstruction of Genuine Pair-Wise Sequence Alignment

To determine the quality of an algorithm, using sequences that were artificially generated in accordance with an appropriate evolution model, the approach was applied to the global version of the Smith-Waterman algorithm (SWA).

Information on the secondary structure improves the quality of protein sequence alignment

The objective of this work was to develop a new and more accurate algorithm taking the secondary structure of proteins into account. The alignments generated by this algorithm, which have the maximal weight with the secondary structure considered, proved to be more accurate than SW alignments.

From analysis of protein structural alignments toward a novel approach to align protein sequences

This work investigated correspondence between “gold standard” alignments of 3D protein structures and the sequence alignments produced by the Smith–Waterman algorithm, currently the most sensitive method for pair‐wise alignment of sequences, and suggested an alternative hierarchical algorithm, which explicitly addresses high scoring regions.

Structure-based evaluation of sequence comparison and fold recognition alignment accuracy.

A benchmark protocol to estimate sequence-to-sequence and sequence-to-structure alignment accuracy is presented, and it is estimated that the best results can be obtained from a combination of amino acid residue substitution matrices and knowledge-based potentials.

A novel approach to local reliability of sequence alignments

A novel approach is presented that attributes a reliability index to every pair of residues, including gapped regions, in the optimal alignment of two protein sequences, based on a fuzzy recast of the dynamic programming algorithm for sequence alignment in terms of mean field annealing.

Dynamic use of multiple parameter sets in sequence alignment

An alignment algorithm to allow dynamic use of multiple parameter sets with different levels of stringency in computation of an optimal alignment of two sequences is described.

An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited.

The relationship between the percentage identity in a resulting alignment and the level of correctness to be expected is given for the top-performing matrix, resulting in a better definition of the so-called "twilight zone".

A generalized global alignment algorithm

A generalized global alignment algorithm is presented for comparing sequences with intermittent similarities, i.e. an ordered list of similar regions separated by different regions. It is implemented as a computer program named GAP3 (Global Alignment Program Version 3).

Basic local alignment search tool.

A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins

A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed. From these findings it is possible to determine whether significant homology