SA-Search: a web tool for protein structure mining based on a Structural Alphabet

Frédéric Guyon, Anne-Claude Camproux, Joëlle Hochez, Pierre Tufféry. Nucleic Acids Research, volume 32 (Web Server issue).

SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of… 
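
To make the 3D-to-1D compression concrete, here is a minimal sketch, not the SA-Search implementation. It assumes each local fragment has already been reduced to a small numeric descriptor, and it assigns each fragment to the nearest of a few prototype conformations. The prototype values, the three-letter alphabet, and the names `PROTOTYPES`, `nearest_letter`, and `encode` are all hypothetical. Nearest-prototype assignment is also a simplification: a hidden Markov model would additionally take the transitions between successive letters into account.

```python
import math

# Hypothetical prototypes: each letter maps to a made-up 3-value descriptor
# (for example, three inter-C-alpha distances in angstroms). These numbers are
# illustrative only and are NOT the alphabet used by SA-Search.
PROTOTYPES = {
    "A": (5.4, 5.0, 5.4),   # helix-like toy prototype
    "B": (6.7, 10.0, 6.7),  # strand-like toy prototype
    "C": (6.0, 7.5, 6.0),   # loop-like toy prototype
}


def nearest_letter(descriptor):
    """Return the letter whose prototype is closest (Euclidean) to descriptor."""
    return min(PROTOTYPES, key=lambda k: math.dist(PROTOTYPES[k], descriptor))


def encode(descriptors):
    """Encode a sequence of fragment descriptors as a 1D string of letters."""
    return "".join(nearest_letter(d) for d in descriptors)


# Four toy fragment descriptors standing in for a short stretch of backbone.
fragments = [(5.5, 5.1, 5.3), (5.4, 5.2, 5.5), (6.1, 7.4, 6.0), (6.6, 9.8, 6.8)]
encoded = encode(fragments)
print(encoded)
```

Once a structure is a plain string, ordinary string and sequence tools (substring search, alignment, database lookup) can be applied to it directly, which is the point the abstract makes.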

Protein structure database search and evolutionary classification
3D-BLAST is as fast as BLAST and calculates the statistical significance of an alignment to indicate the reliability of the prediction. It uses an enhanced BLAST as its search method, with a new structural alphabet substitution matrix (SASM), to find the longest common substructures with high-scoring structured segment pairs in an SADB database.
Protein Block Expert (PBE): a web-based protein structure analysis server using a structural alphabet
Protein Block Expert (PBE) implements a methodology that compares protein structures encoded as sequences of Protein Blocks (PBs) by aligning them with dynamic programming and a PB substitution matrix.
Protein structure search and local structure characterization
The experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding.
Candidate Fragments Prediction and their Assembly with a Greedy Algorithm and a Coarse-Grained Force Field to solve Protein Folding
A method to fold proteins in silico is discussed, starting from an HMM-based structural alphabet that provides a local 3D description of the structure, together with an approach to generate de novo 3D models of proteins.
mulPBA: an efficient multiple protein structure alignment method based on a structural alphabet
A new web server called multiple Protein Block Alignment (mulPBA) is proposed. It implements a method based on a structural alphabet that describes the backbone conformation of a protein chain in terms of dihedral angles, enabling the use of powerful sequence alignment methods for primary structure comparison.
Protein structural similarity search by Ramachandran codes
SARST demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools that are applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for rapid search of protein structure database
A novel protein structure database search tool that is useful for analyzing novel structures, returns a ranked list of alignments, and employs a kappa-alpha (κ, α) plot derived structural alphabet and a new substitution matrix.
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops
A new web server SA-Mot (Structural Alphabet Motif), which uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures, so that the users can easily locate loop regions that are important for the protein folding and function.
A substitution matrix for structural alphabet based on structural alignment of homologous proteins and its applications
It is demonstrated that PBs can be efficiently used to detect the lobe/domain flexibility in the multidomain proteins and shown that, in variable regions between two superimposed homologous proteins, one can distinguish between local conformational differences and rigid‐body displacement of a conserved motif by comparing the PBs and their substitution scores.
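
Several of the tools listed above compare structures by aligning their structural-alphabet strings with dynamic programming and a substitution matrix. The following is a minimal sketch of that idea using a global (Needleman-Wunsch) alignment score. The scoring values, the linear gap penalty, and the names `substitution`, `GAP`, and `global_alignment_score` are assumptions for illustration; none of them is taken from PBE, 3D-BLAST, or any other tool named here, each of which defines its own matrix.

```python
def substitution(a, b):
    """Toy substitution score (assumed): +2 for identical letters, -1 otherwise."""
    return 2 if a == b else -1


GAP = -2  # assumed linear gap penalty


def global_alignment_score(s, t):
    """Needleman-Wunsch score between two structural-alphabet strings."""
    # prev[j] holds the best score for aligning the processed prefix of s
    # with the first j letters of t.
    prev = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * GAP]
        for j in range(1, len(t) + 1):
            curr.append(max(
                prev[j - 1] + substitution(s[i - 1], t[j - 1]),  # align letters
                prev[j] + GAP,                                   # gap in t
                curr[j - 1] + GAP,                               # gap in s
            ))
        prev = curr
    return prev[-1]


print(global_alignment_score("AACB", "ACB"))
```

A real substitution matrix for a structural alphabet would score pairs of letters by how similar their underlying conformations are, so that near-identical local shapes are penalized less than unrelated ones.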


MATRAS: a program for protein 3D structure comparison
A web server for comparing protein 3D structures using the program Matras, which employs the progressive alignment algorithm, in which pairwise 3D alignments are assembled in the proper order.
Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.
A new algorithm is reported which builds an alignment between two protein structures. The algorithm involves a combinatorial extension (CE) of an alignment path defined by aligned fragment pairs
Protein structure alignment using a genetic algorithm
A novel, fully automatic method for aligning the three‐dimensional structures of two proteins, which has been applied to proteins from several well‐studied families: globins, immunoglobulins, serine proteases, dihydrofolate reductases, and DNA methyltransferases.
Hierarchical Protein Structure Superposition Using Both Secondary Structure and Atomic Representations
A new algorithm for the comparison of proteins based on a hierarchy of structural representations, from the secondary structure level to the atomic level is presented, which was able to detect structural similarity at the same level as DALI.
PISCES: a protein sequence culling server
PISCES is a public server for culling sets of protein sequences from the Protein Data Bank by sequence identity and structural quality criteria and provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity.
A hidden markov model derived structural alphabet for proteins.
Protein structure comparison
The all-against-all comparison of known protein structures has allowed the structure of “fold-space” to be mapped, from broad general trends to specific clusters of proteins that have the same evolutionary origins.
LGA: a method for finding 3D similarities in protein structures
A. Zemla. Nucleic Acids Res., 2003.
Data generated by LGA can be successfully used in a scoring function to rank the level of similarity between two structures and to allow structure classification when many proteins are being analyzed.
Surprising similarities in structure comparison.
Protein structure comparison by alignment of distance matrices.
A novel algorithm (DALI) for optimal pairwise alignment of protein structures that identifies structural resemblances and common structural cores accurately and sensitively, even in the presence of geometrical distortions is developed.