T-Coffee: A novel method for fast and accurate multiple sequence alignment.

  • C. Notredame, Desmond G. Higgins, Jaap Heringa
  • Journal of Molecular Biology, volume 302, issue 1
We describe a new method (T-Coffee) for multiple sequence alignment that provides a dramatic improvement in accuracy with a modest sacrifice in speed as compared to the most commonly used alternatives. The resulting alignments are significantly more reliable, as determined by comparison with a set of 141 test cases, than any of the popular alternatives that we tried. The improvement, especially clear with the more difficult test cases, is always visible, regardless of the phylogenetic spread of…

A New Approach for Tree Alignment Based on Local Re-Optimization
  • Feng Yue, Jijun Tang
  • Computer Science, Biology
    2008 International Conference on BioMedical Engineering and Informatics
  • 2008
This paper presents a new algorithm for multiple sequence alignment based on phylogenetic trees. It can handle both DNA and protein data and can use a simple cost model as well as complex substitution matrices, such as the PAM or BLOSUM series.
MANGO: a new approach to multiple sequence alignment.
  • Zefeng Zhang, Hao Lin, Ming Li
  • Computer Science
    Computational systems bioinformatics. Computational Systems Bioinformatics Conference
  • 2007
This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently, avoiding problems caused by the popular progressive approaches.
Integrated multiple sequence alignment
A technique is established that can accurately align sequences containing possibly repeated motifs and can compare tandem repeat sequences by aligning them with respect to their possible repeat histories.
T-Coffee: Tree-based consistency objective function for alignment evaluation.
This chapter presents how the T-Coffee package can be used in its command line mode to carry out the most common tasks and multiply align proteins, DNA, and RNA sequences.
Grammar-based distance in progressive multiple sequence alignment
The proposed multiple sequence alignment algorithm has successfully built multiple alignments comparable to those of other programs, with significant improvements in running time that are especially striking for large datasets.
DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment
A complete re-implementation of the segment-based approach to multiple protein alignment that contains a number of improvements compared to the previous version 2.2 of DIALIGN and is comparable to the standard global aligner CLUSTAL W, though it is outperformed by some newly developed programs that focus on global alignment.
A min-cut algorithm for the consistency problem in multiple sequence alignment
This work proposes a graph-theoretical approach to find local multiple sequence similarities that consistently outperforms the standard version of DIALIGN where local pairwise alignments are greedily incorporated into a multiple alignment.
RASCAL: Rapid Scanning and Correction of Multiple Sequence Alignments
The accuracy and reliability of RASCAL are demonstrated using alignments from the BAliBASE benchmark database, where significant improvements were often observed, with no deterioration of the existing high-quality regions.
A Hybrid method for effective multiple sequence alignment
A hybrid algorithm is described that aims to improve the accuracy of progressive global alignments, especially for families that include sequences with large N/C-terminal extensions, along with its ability to achieve good-quality solutions.
Segment-based multiple sequence alignment
The main problem is to define segments of the sequences in such a way that a graph-based alignment is possible, and the consistency idea can be extended to align multiple genomic sequences.


COFFEE: an objective function for multiple sequence alignments
It is shown that multiple sequence alignments can be optimized for their COFFEE score with the genetic algorithm package SAGA and that, given a library of structure-based pairwise alignments extracted from FSSP, SAGA can produce high-quality multiple sequence alignments.
Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments.
  • O. Gotoh
  • Biology
    Journal of molecular biology
  • 1996
The improvement in accuracy of alignments obtained by these iterative methods over pairwise or progressive methods tends to increase with decreasing average sequence identity, implying that iterative refinement is more effective for the generally difficult alignment of remotely related sequences.
A comprehensive comparison of multiple sequence alignment programs
This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases, and proposes appropriate alignment strategies, depending on the nature of a particular set of sequences.
Combining many multiple alignments in one improved alignment
A method that extracts qualitatively good sub-alignments from a set of multiple alignments and combines these into a new, often improved alignment, implemented as a variant of the traditional dynamic programming technique.
A tool for multiple sequence alignment.
The design and application of a tool for multiple alignment of amino acid sequences that implements a new algorithm that greatly reduces the computational demands of dynamic programming is described.
The multiple sequence alignment problem in biology
It is proved here that knowledge of the measure of an arbitrarily chosen alignment can be used in combination with information from the pairwise alignments to considerably restrict the size of the region of the lattice in consideration.
Improved tools for biological sequence comparison.
  • W. Pearson, D. Lipman
  • Biology, Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 1988
Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.
Extracting protein alignment models from the sequence database.
Investigating distant relationships and merging families into superfamilies further confirms the notion that proteins evolved from relatively few ancient sequences by generating models of these ancient conserved regions for rapid and sensitive screening of sequences.