A novel algorithm has been developed for scoring the match between an imprecise sparse signature and all the protein sequences in a sequence database. The method was applied to a specific problem: signatures were derived from the probable folding nucleus and positions obtained from the determined interactions that occur during the folding of three small…
We extend the concept of the motif as a tool for characterizing protein families and explore the feasibility of a sparse "motif" that is the length of the protein sequence itself. The type of motif discussed is a sparse family signature consisting of a set of N key residue positions (A1, A2, ..., AN), each preceded by a gap (G), thus G1A1G2A2...GNAN. Both a residue…
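The G1A1G2A2...GNAN layout described above can be pictured as a list of (gap, residue) pairs. The following is a minimal sketch of that idea only, not the authors' algorithm: the names `SignaturePosition`, `match_at`, and `find_matches` are hypothetical, gaps are assumed to be exact residue counts, and each key position is assumed to accept a fixed set of residues. The abstract is truncated before it says how positions are actually scored, so no scoring is attempted here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class SignaturePosition:
    """One G_i A_i element of a sparse signature (hypothetical representation)."""
    gap: int                   # G_i: residues skipped before this key position (assumed exact)
    residues: FrozenSet[str]   # A_i: residues accepted at this key position (assumed a plain set)


def match_at(signature: Sequence[SignaturePosition], sequence: str, start: int = 0) -> bool:
    """Return True if the signature fits the sequence when laid down from `start`."""
    pos = start
    for element in signature:
        pos += element.gap                      # skip the gap G_i
        if pos >= len(sequence) or sequence[pos] not in element.residues:
            return False                        # ran off the end, or wrong residue at A_i
        pos += 1                                # move past the key residue
    return True


def find_matches(signature: Sequence[SignaturePosition], sequence: str) -> List[int]:
    """Return every start offset at which the signature fits exactly."""
    return [s for s in range(len(sequence)) if match_at(signature, sequence, s)]


# Illustrative data only: a two-position signature, G1=1 A1={C}, G2=2 A2={H, W}.
example_signature = [
    SignaturePosition(gap=1, residues=frozenset("C")),
    SignaturePosition(gap=2, residues=frozenset("HW")),
]
example_hits = find_matches(example_signature, "ACDEHKCAAW")
```

Exact matching is the simplest possible reading of the notation. A signature with imprecise gaps would need each gap to allow a range of lengths, which this sketch deliberately leaves out.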
We identified key residues from the structural alignment of families of protein domains from SCOP which we represented in the form of sparse protein signatures. A signature-generating algorithm (SigGen) was developed and used to automatically identify key residues based on several structural and sequence-based criteria. The capacity of the signatures to…
Receiver operating characteristic (ROC) analysis is a powerful and widely used technique for assessing predictive methods, yet there are no generic, open-source software tools for this that are freely available. Our ROCPLOT program performs ROC analysis on one or more files of search results (hits) and generates the following: (i) ROC values,…
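For readers unfamiliar with ROC analysis of ranked search results, the general technique can be sketched as follows. This is a generic textbook construction, not ROCPLOT's implementation or file format: the function names `roc_points` and `roc_area` are hypothetical, each hit is assumed to be a `(score, is_true_hit)` pair with higher scores ranked first, and tied scores are not given any special treatment.

```python
from typing import List, Sequence, Tuple

Hit = Tuple[float, bool]  # (score, is_true_hit): an assumed, simplified hit record


def roc_points(hits: Sequence[Hit]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points, best-scoring hit first."""
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    positives = sum(1 for _, is_true in ranked if is_true)
    negatives = len(ranked) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC analysis needs at least one true and one false hit")

    points = [(0.0, 0.0)]
    true_seen = false_seen = 0
    for _, is_true in ranked:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
        points.append((false_seen / negatives, true_seen / positives))
    return points


def roc_area(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Illustrative data only: five ranked hits, three of them true.
example_hits = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
example_area = roc_area(roc_points(example_hits))
```

An area of 1.0 means every true hit outranks every false hit, while 0.5 is what a random ranking would give on average.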
The Protein Data Bank contains coordinates of over 10,000 protein structures, which constitute more than 25,000 structural domains in total. The investigation of protein structural, functional and evolutionary relationships is fundamental to many important fields in bioinformatics research, and will be crucial in determining the function of the human and…