Analysis of loop boundaries using different local structure assignment methods.
Secondary structure is used in the hierarchical classification of protein structures, in the identification of protein features such as helix caps and loops, in fold recognition, and as a precursor to ab initio structure prediction. Several methods are available for assigning secondary structure when the three-dimensional structure of the protein is known. Unfortunately, they differ in their definitions, particularly in the exact positions of the termini. Additionally, most existing methods rely on hydrogen bonding, which means that important secondary structural classes, such as isolated β-strands and poly-proline helices, cannot be identified because they lack characteristic hydrogen-bonding patterns. For this reason we have developed a more accurate method for assigning secondary structure based on main-chain geometry, which also allows a more comprehensive assignment of secondary structure. We define secondary structure using a number of geometric parameters. Helices are defined by whether they fit inside an imaginary cylinder: residues must be within the correct radius of a central axis. Different types of helix (α, 3₁₀ or π) are assigned on the basis of the angle between successive peptide bonds. β-strands are assigned on the basis of backbone dihedral angles and alternating peptide bonds. Thus hydrogen bonding is not required, and β-strands can be within a parallel sheet, within an antiparallel sheet, or isolated. Poly-proline helices are defined similarly, although with three-fold symmetry. We find that our method assigns secondary structure better than existing methods. Specifically, compared with other methods, amino-acid trends at helix caps are stronger, secondary structural elements are less likely to be concatenated together, and secondary-structure-guided sequence alignment is improved.
We conclude, therefore, that secondary structure assignments made using our method better reflect the physical and evolutionary characteristics of proteins. The program is available from http://www.bioinf.man.ac.uk/~lovell/segno.shtml
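The cylinder criterion for helices described above can be illustrated with a minimal Python sketch. It is not the published program: the function names, the tolerance, and the reading of "within the correct radius" as "radial distance close to a target radius" are all assumptions made for illustration. The geometry used to build the test helix (about 2.3 Å Cα radius, 1.5 Å rise and 100° turn per residue) is the textbook ideal α-helix, and the axis is supplied by the caller rather than fitted.

```python
import math


def distance_to_axis(point, axis_point, axis_dir):
    """Perpendicular distance from a 3D point to the line through
    axis_point with direction axis_dir (need not be unit length)."""
    # Vector from the axis reference point to the atom.
    v = [p - a for p, a in zip(point, axis_point)]
    # Normalise the axis direction.
    norm = math.sqrt(sum(d * d for d in axis_dir))
    u = [d / norm for d in axis_dir]
    # Length of the component of v that lies along the axis.
    along = sum(vi * ui for vi, ui in zip(v, u))
    # What remains after removing the axial component is radial.
    radial = [vi - along * ui for vi, ui in zip(v, u)]
    return math.sqrt(sum(r * r for r in radial))


def fits_cylinder(points, axis_point, axis_dir, radius, tolerance):
    """True if every point lies within `tolerance` of `radius` from the axis.
    Hypothetical criterion: the thresholds are illustrative assumptions."""
    return all(
        abs(distance_to_axis(p, axis_point, axis_dir) - radius) <= tolerance
        for p in points
    )


def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha positions of an idealised alpha-helix wound around the z-axis."""
    return [
        (
            radius * math.cos(math.radians(i * turn_deg)),
            radius * math.sin(math.radians(i * turn_deg)),
            i * rise,
        )
        for i in range(n)
    ]


if __name__ == "__main__":
    origin, z_axis = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    helix = ideal_helix(8)
    # A fully straight chain lying on the axis, used as a non-helical control.
    straight = [(0.0, 0.0, 3.3 * i) for i in range(8)]
    print(fits_cylinder(helix, origin, z_axis, radius=2.3, tolerance=0.5))     # True
    print(fits_cylinder(straight, origin, z_axis, radius=2.3, tolerance=0.5))  # False
```

In practice the axis would have to be estimated from the coordinates themselves, and distinguishing α, 3₁₀ and π helices would need the additional peptide-bond angle test mentioned in the abstract; neither step is attempted here.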