CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors: J. D. Thompson, Desmond G. Higgins, Toby J. Gibson
Journal: Nucleic Acids Research, volume 22, issue 22

The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific… 
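
The first idea in the abstract, down-weighting near-duplicate sequences and up-weighting divergent ones, can be illustrated with a small sketch. It assumes weights are read off a guide tree: each sequence collects the branch lengths on its path from the root, and any branch shared by several sequences is split equally among them. The tree encoding, the function names `leaf_names` and `tree_weights`, and the example branch lengths are all hypothetical choices made for illustration, not code or data from the paper.

```python
from typing import Dict, List

# Assumed encoding (hypothetical): a leaf is (name, branch_length);
# an internal node is (list_of_children, branch_length).


def leaf_names(node) -> List[str]:
    """Return the names of all leaves at or below this node."""
    head, _ = node
    if isinstance(head, str):
        return [head]
    names: List[str] = []
    for child in head:
        names.extend(leaf_names(child))
    return names


def tree_weights(node, inherited: float = 0.0) -> Dict[str, float]:
    """Weight each leaf by its root-to-leaf path length, where every
    branch contributes its length divided by the number of leaves below it."""
    head, length = node
    share = inherited + length / len(leaf_names(node))
    if isinstance(head, str):
        return {head: share}
    weights: Dict[str, float] = {}
    for child in head:
        weights.update(tree_weights(child, share))
    return weights


# Illustrative tree: A and B are close relatives, C is divergent.
EXAMPLE_TREE = ([([("A", 0.1), ("B", 0.1)], 0.3), ("C", 0.5)], 0.0)

if __name__ == "__main__":
    for name, weight in sorted(tree_weights(EXAMPLE_TREE).items()):
        print(name, round(weight, 3))
```

In this toy tree, A and B share the 0.3 branch, so each receives half of it, while C keeps its whole 0.5 branch and ends up with the largest weight. In practice such weights would typically be normalized before use; that step is omitted here.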

Improvement of clustal-derived sequence alignments with evolutionary algorithms

Previous efforts using evolutionary algorithms (EAs) for MSA were extended and three new alignment operators were introduced and tested within the framework of protein sequence alignment, showing the degree to which EAs can enhance the results of Clustal X.

Introducing Variable Gap Penalties into Three-Sequence Alignment for Protein Sequences

An algorithm that finds a global, optimal alignment among three protein sequences using position-specific gap penalties, which allow gap penalties to vary so that residue-dependent information and protein structure information can be applied to the three-sequence alignment.

Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility

The relocation of gaps by the new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein.

Refining multiple sequence alignments with conserved core regions

A new algorithm is presented, REFINER, that refines a multiple sequence alignment by iterative realignment of its individual sequences with the predetermined conserved core (block) model of a protein family.

Local Weighting Schemes for Protein Multiple Sequence Alignment

Multiple sequence alignment based on dynamic weighted guidance tree

  • K. Nguyen, Yi Pan
  • Computer Science, Biology
    Int. J. Bioinform. Res. Appl.
  • 2011
This study proposes a new multiple sequence alignment strategy that uses sequence information, including global and local similarities, to assign a different weight to each input sequence, and calculates a weighted pairwise distance matrix to build a dynamic alignment guide tree.

Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments.

  • O. Gotoh
  • Biology
    Journal of molecular biology
  • 1996
The improvement in accuracy of alignments obtained by these iterative methods over pairwise or progressive methods tends to increase with decreasing average sequence identity, implying that iterative refinement is more effective for the generally difficult alignment of remotely related sequences.

Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost

A novel group-to-group sequence alignment algorithm that uses a piecewise linear gap cost and can construct accurate alignments comparable to the most accurate programs currently available, including L-INS-i of MAFFT, ProbCons, and T-Coffee.
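
As a rough illustration of what a piecewise linear gap cost means, the sketch below charges a fixed opening cost, a steeper per-residue cost for the first few gap positions, and a shallower per-residue cost beyond a breakpoint. The function name `piecewise_linear_gap_cost`, the two-segment shape, and every parameter value are assumptions chosen for illustration; they are not taken from the cited work.

```python
def piecewise_linear_gap_cost(
    length: int,
    open_cost: float = 10.0,   # hypothetical cost of opening a gap
    short_ext: float = 1.0,    # hypothetical per-residue cost up to the breakpoint
    long_ext: float = 0.2,     # hypothetical per-residue cost past the breakpoint
    breakpoint: int = 8,       # hypothetical gap length where the slope changes
) -> float:
    """Return the cost of a gap of the given length under a two-segment
    piecewise linear model. A non-positive length means no gap and costs 0."""
    if length <= 0:
        return 0.0
    short_part = min(length, breakpoint)
    long_part = max(0, length - breakpoint)
    return open_cost + short_ext * short_part + long_ext * long_part


if __name__ == "__main__":
    for k in (1, 8, 18):
        print(k, piecewise_linear_gap_cost(k))
```

Compared with a single affine penalty, the shallower second slope makes long insertions or deletions relatively cheaper, which is the usual motivation for this family of gap costs.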


