Multiple sequence alignment using Probcons and Probalign.

  • Usman Roshan
  • Published 2014
  • Biology
  • Methods in molecular biology
Sequence alignment remains a fundamental task in bioinformatics. The literature contains programs that employ a host of exact and heuristic strategies available in computer science. Probcons was the first program to construct maximum expected accuracy sequence alignments with hidden Markov models and, at the time of its publication, achieved the highest accuracies on standard protein multiple alignment benchmarks. Probalign followed this strategy except that it used a partition function approach…
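The maximum expected accuracy idea mentioned in the abstract can be illustrated with a small sketch. It assumes a matrix of posterior match probabilities for two sequences is already available (the abstract indicates Probcons obtains these from hidden Markov models and Probalign from a partition function approach; neither computation is shown here). The function name `mea_align` and the example matrix are hypothetical, and this is not the code of either program, only a minimal pairwise dynamic program that picks the alignment with the largest summed posterior probability.

```python
def mea_align(post):
    """Pairwise maximum expected accuracy alignment (illustrative sketch).

    post[i][j] is an assumed posterior probability that residue i of the
    first sequence is aligned to residue j of the second sequence.
    Returns (score, pairs), where pairs lists the matched index pairs.
    """
    n = len(post)
    m = len(post[0]) if n else 0

    # best[i][j]: largest summed posterior for the prefixes of length i and j
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + post[i - 1][j - 1],  # align i-1 with j-1
                best[i - 1][j],                           # gap in second sequence
                best[i][j - 1],                           # gap in first sequence
            )

    # Trace back from the bottom-right corner to recover the matched pairs
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Hypothetical posterior matrix for a 2-residue and a 3-residue sequence
example_post = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
]
score, pairs = mea_align(example_post)
print(score, pairs)
```

In this toy case the recovered pairs follow the two largest posterior entries along a monotone path. A multiple alignment method would apply a step of this kind repeatedly along a guide tree, which the sketch does not attempt.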

Multiple Sequence Alignment: Methods and Protocols

Clustal Omega is a version, completely rewritten and revised in 2011, of the widely used Clustal series of programs for multiple sequence alignment. It can deal with very large numbers (many tens of

Generalized Bootstrap Supports for Phylogenetic Analyses of Protein Sequences Incorporating Alignment Uncertainty

Unistrap, a novel approach that estimates the combined effect of alignment uncertainty and site sampling on phylogenetic tree branch supports, provides branch support estimates that take into account a larger fraction of the parameters impacting tree instability when processing datasets containing a large number of sequences.

QuanTest2: benchmarking multiple sequence alignments using secondary structure prediction

An improved strategy for selecting reference and non-reference sequences is developed for a new benchmark, QuanTest2, in which SSPA and SP correlate better on an alignment-by-alignment basis than in QuanTest.

The application of binary path matrix in backtracking of sequences alignment

This paper proposes an optimization algorithm based on backtracking of the binary path matrix. The algorithm is suitable for both global pairwise sequence alignment and local sequence alignment, can find more identical bases, and provides a better basis for analyzing the similarity of sequences.

Seaview Version 5: A Multiplatform Software for Multiple Sequence Alignment, Molecular Phylogenetic Analyses, and Tree Reconciliation.

Seaview version 5 introduces the ability to reconcile a gene tree with a reference species tree and use this reconciliation to root and rearrange the gene tree.

Synopsis of the SOFL Plant-Specific Gene Family

Overall, it is reported that SOFLs are a plant-specific gene family characterized by two conserved domains that are important for function.

Chromosome-level reference genome and alternative splicing atlas of moso bamboo (Phyllostachys edulis)

A chromosome-level de novo genome assembly of moso bamboo is provided using additional abundance sequencing data and a Hi-C scaffolding strategy, and it is observed that the AS genes are concentrated among more conserved genes that tend to accumulate higher transcript levels and share less tissue specificity.

Loop 1 of APOBEC3C regulates its antiviral activity against HIV-1

Replacing two residues in loop 1 of A3C protein with conserved positively-charged amino acids enhances the substrate DNA binding, which markedly facilitates its deamination-dependent antiviral activity against HIV-1 as well as increasing the restriction of LINE-1 retroelements.



ProbCons: Probabilistic consistency-based multiple sequence alignment.

This paper presents ProbCons, a practical tool for progressive protein multiple sequence alignment based on probabilistic consistency, and evaluates its performance on several standard alignment benchmark data sets.

Probalign: multiple sequence alignment using partition function posterior probabilities

The results indicate that Probalign alignments are generally more accurate than those of other leading multiple sequence alignment methods (i.e. Probcons, MAFFT and MUSCLE) on the BAliBASE 3.0 protein alignment benchmark and outperform these methods on the HOMSTRAD and OXBENCH benchmarks.

Recent progress in multiple sequence alignment: a survey.

In this review, existing techniques are described and the potential strengths and weaknesses of the most widely used multiple alignment packages are exposed.

MUSCLE: multiple sequence alignment with high accuracy and high throughput.

  • R. Edgar
  • Computer Science
    Nucleic acids research
  • 2004
MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.

A protein alignment scoring system sensitive at all evolutionary distances

  • S. Altschul
  • Biology, Computer Science
    Journal of Molecular Evolution
  • 2004
This paper formalizes this concept by defining a scoring system that is sensitive at all detectable evolutionary distances, and shows that for a typical protein database search, estimating the originally unknown evolutionary distance appropriate to each alignment costs slightly over two bits of information, or somewhat less than a factor of five in statistical significance.

A comprehensive comparison of multiple sequence alignment programs

This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases, and proposes appropriate alignment strategies, depending on the nature of a particular set of sequences.

Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids

This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods and, more generally, of probabilistic methods of sequence analysis.

PLAST-ncRNA: Partition function Local Alignment Search Tool for non-coding RNA sequences

This server takes as input a query RNA sequence and a large genome sequence, and outputs a list of hits that are above a mean posterior probability threshold, and is presented in a format suited to local alignment.

Searching for evolutionary distant RNA homologs within genomic sequences using partition function posterior probabilities

It is demonstrated, for the first time, that partition function match probabilities used for expected accuracy alignment, as done in Probalign, provide statistically significant improvement over current approaches for identifying distantly related RNA sequences in larger genomic segments.

BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark

The latest release of the most widely used multiple alignment benchmark, BAliBASE, which provides high quality, manually refined, reference alignments based on 3D structural superpositions is presented, including new, more challenging test cases, representing the real problems encountered when aligning large sets of complex sequences.