MAVL/StickWRLD for protein: visualizing protein sequence families to detect non-consensus features

Abstract

A fundamental problem with applying consensus, weight-matrix, or hidden Markov models as search tools for biosequences is that the model itself gives no indication of whether the modeled sequences display dependencies between positional identities. In some instances, these dependencies are crucial to correctly accepting or rejecting other sequences as members of the family. MAVL (multiple alignment variation linker) and StickWRLD provide a web-based method to visually survey the model-training sequences to discover and characterize possible dependencies. MAVL/StickWRLD was initially introduced for nucleic acid sequences, where it makes it easy to distinguish typical DNA or RNA structural dependencies in input families, identify mixed populations of distinct subfamilies, or discover novel dependencies that result from binding interactions or other selective pressures [W. Ray (2004) Nucleic Acids Res., 32, W59-W63]. Since the announcement of MAVL/StickWRLD for nucleic acids, one of the most requested new features has been the extension of this visualization method to support protein alignments. We are pleased to report that this extension has been successful, that the basic visualization has been augmented in several ways to enhance protein viewing, and that the results with protein alignments are even more dramatic than with nucleic acid (NA) alignments. MAVL/StickWRLD can be accessed at http://www.microbial-pathogenesis.org/stickwrld/.
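
To make the idea of a dependency between positional identities concrete, the following is a minimal sketch, not the MAVL/StickWRLD implementation. It assumes a simple statistic: for each pair of alignment columns, compare the observed joint frequency of a residue pair with the frequency expected if the two columns were independent. The function name `positional_dependencies`, the `threshold` parameter, and the toy alignment are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations, product


def positional_dependencies(alignment, threshold=0.2):
    """Return (i, a, j, b, residual) tuples for residue pairs whose observed
    joint frequency differs from the independence expectation by more than
    `threshold`. This is an illustrative statistic, assumed for this sketch."""
    n = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Per-column residue counts: what a weight matrix would capture.
    column_counts = [Counter(seq[i] for seq in alignment) for i in range(length)]

    hits = []
    for i, j in combinations(range(length), 2):
        # Joint counts for the column pair: what a weight matrix cannot capture.
        pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
        # Include pairs that never co-occur, so avoided combinations show up
        # as negative residuals.
        for a, b in product(column_counts[i], column_counts[j]):
            observed = pair_counts[(a, b)] / n
            expected = (column_counts[i][a] / n) * (column_counts[j][b] / n)
            residual = observed - expected
            if abs(residual) > threshold:
                hits.append((i, a, j, b, residual))
    return hits


# Hypothetical toy alignment: columns 0 and 2 vary together.
toy = ["ACDK", "ACDK", "GCEK", "GCEK"]
hits = positional_dependencies(toy)
for i, a, j, b, residual in hits:
    print(f"pos {i}={a} with pos {j}={b}: residual {residual:+.2f}")
```

In this toy family, a per-column model would score the sequence ACEK as well as any training sequence, because A at position 0 and E at position 2 are each common on their own. The pairwise residuals show that A and E never occur together, which is the kind of non-consensus feature a visual survey of the training sequences is meant to expose.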

DOI: 10.1093/nar/gki374

Cite this paper

@article{Ray2005MAVLStickWRLDFP,
  title={MAVL/StickWRLD for protein: visualizing protein sequence families to detect non-consensus features},
  author={William C. Ray},
  journal={Nucleic Acids Research},
  year={2005},
  volume={33},
  pages={W315--W319}
}