Rapid and sensitive protein similarity searches.

@article{Lipman1985RapidAS,
  title={Rapid and sensitive protein similarity searches.},
  author={David J. Lipman and William R. Pearson},
  journal={Science},
  year={1985},
  volume={227},
  number={4693},
  pages={1435--1441}
}
An algorithm was developed which facilitates the search for similarities between newly determined amino acid sequences and sequences already available in databases. Because of the algorithm's efficiency on many microcomputers, sensitive protein database searches may now become a routine procedure for molecular biologists. The method efficiently identifies regions of similar sequence and then scores the aligned identical and differing residues in those regions by means of an amino acid… 
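The two steps the abstract names (find regions of similar sequence, then score the aligned residues in those regions) can be illustrated with a short Python sketch. This is not the authors' implementation: the k-tuple lookup table and diagonal counting are a common way to locate similar regions, and the match/mismatch scoring is a toy stand-in chosen for illustration, since the abstract is truncated before it names its scoring scheme. All function names and parameters (`ktuple_index`, `best_diagonal`, `score_diagonal`, `k`, `match`, `mismatch`) are hypothetical.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map every k-tuple in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def best_diagonal(query, target, k=2):
    """Return (diagonal, hit_count) for the diagonal with the most shared k-tuples.

    A diagonal is the offset target_pos - query_pos; hits on the same
    diagonal line up with each other without gaps.
    """
    index = ktuple_index(query, k)
    hits = defaultdict(int)
    for j in range(len(target) - k + 1):
        # .get avoids inserting empty entries into the lookup table
        for i in index.get(target[j:j + k], ()):
            hits[j - i] += 1
    if not hits:
        return None, 0
    # Prefer the most hits; break ties toward the diagonal closest to zero
    diag = max(hits, key=lambda d: (hits[d], -abs(d)))
    return diag, hits[diag]


def score_diagonal(query, target, diag, match=1, mismatch=-1):
    """Best-scoring ungapped run along one diagonal (toy scoring, an assumption)."""
    best = running = 0
    for i in range(len(query)):
        j = i + diag
        if 0 <= j < len(target):
            running += match if query[i] == target[j] else mismatch
            running = max(running, 0)  # restart the run when the score drops below zero
            best = max(best, running)
    return best
```

For example, `best_diagonal("ACDEFGHIK", "WWACDEFGHIKWW")` locates the offset at which the query lines up inside the longer sequence, and `score_diagonal` then scores only that diagonal instead of the full comparison matrix, which is where the time saving of this kind of heuristic comes from.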
Automat and BLAST: comparison of two protein sequence similarity search programs
TLDR
The differences and similarities in their basic principles, their use, and their performance are analysed in this paper in order to allow optimal use of these important software tools.
A Space-Efficient Approach towards Distantly Homologous Protein Similarity Searches
TLDR
This paper proposes a heuristic pair-wise sequence alignment algorithm that is sufficiently fast to be applicable to database searches for short query sequences, has constant auxiliary space requirements, produces good alignments, and is sensitive enough to return even distantly related protein chains that might be of interest.
Database Searching with DNA and Protein Sequences: An Introduction
TLDR
This review of sequence database searching aims to set out current practice in the area, and describes the basic principles behind the programs and enumerates the range of databases available in the public domain.
BLAST and FASTA similarity searching for multiple sequence alignment.
  • W. Pearson
  • Biology
    Methods in molecular biology
  • 2014
TLDR
Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
SALSA: improved protein database searching by a new algorithm for assembly of sequence fragments into gapped alignments
TLDR
A new algorithm has been devised for the computation of a gapped alignment of two sequences using dynamic programming to build an accurate alignment based on the fragments initially identified.
The Protein Identification Resource (PIR): An On-Line Computer System for the Characterization of Proteins Based on Comparisons with Previously Characterized Protein Sequences
TLDR
This conference is dedicated to examining new methods for the isolation and characterization of proteins, and the primary structures of well over 3,000 proteins containing almost three quarters of a million residues are now known.
Indexing and Retrieval for Genomic Databases
TLDR
It is shown experimentally that the indexed approach results in significant savings in computationally intensive local alignments and that index-based searching is as accurate as existing exhaustive search schemes.
FLASH: a fast look-up algorithm for string homology
  • A. Califano, I. Rigoutsos
  • Computer Science
    Proceedings of IEEE Conference on Computer Vision and Pattern Recognition
  • 1993
TLDR
The algorithm presented is based on a probabilistic indexing framework which requires minimal access to the database for each match, and is shown to scale well to databases containing billions of nucleotides with performances that are orders of magnitude better than the fastest of the current techniques.

References

SHOWING 1-10 OF 27 REFERENCES
Rapid similarity searches of nucleic acid and protein data banks.
  • W. Wilbur, D. Lipman
  • Biology, Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 1983
TLDR
An algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity.
Efficient algorithms for folding and comparing nucleic acid sequences
TLDR
The homology and secondary structure programs are respectively illustrated with a comparison of two phage genomes, and a discussion of Drosophila melanogaster 5S RNA folding.
New approaches for computer analysis of nucleic acid sequences.
A new high-speed computer algorithm is outlined that ascertains within and between nucleic acid and protein sequences all direct repeats, dyad symmetries, and other structural relationships. Large
A test for nucleotide sequence homology.
Identification of common molecular subsequences.
THE CONTEXT DEPENDENT COMPARISON OF BIOLOGICAL SEQUENCES
TLDR
A dynamic programming algorithm is given for the calculation of a context dependent similarity score between sequences and conditions are described which allow the conversion of the similarity score to a metric distance.
Enhanced graphic matrix analysis of nucleic acid and protein sequences.
  • J. Maizel, R. Lenk
  • Biology
    Proceedings of the National Academy of Sciences of the United States of America
  • 1981
TLDR
Computer translation of nucleic acid sequences into all possible amino acid sequences followed by graphic matrix analysis provides a way to detect the most likely protein encoding regions and can predict the correct reading frames in sequences in which splicing patterns are not defined.
Pattern recognition in nucleic acid sequences. I. A general method for finding local homologies and symmetries
TLDR
An algorithm is presented, a generalization of the Needleman-Wunsch-Sellers algorithm, which finds within longer sequences all subsequences that resemble one another locally.
On the statistical significance of nucleic acid similarities
TLDR
It is demonstrated that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures, and a series of models are proposed which account for some of these known statistical properties.