Reece Hart

We present an efficient algorithm to systematically and automatically identify patterns in protein sequence families. The procedure is based on the Splash deterministic pattern discovery algorithm and on a framework to assess the statistical significance of patterns. We demonstrate its application to the fully automated discovery of patterns in 974 PROSITE…
PROSITE is a method for protein classification that relies on a database of biologically significant sites and patterns in protein sequences. Most patterns in PROSITE have been gathered through a labor-intensive combination of experimental characterization of functional residues and sequence alignment. In this paper we present a new and efficient supervised…