A novel method for GPCR recognition and family classification from sequence alone using signatures derived from profile hidden Markov models.
G protein-coupled receptors (GPCRs) constitute the largest known family of cell-surface receptors. With hundreds of members populating the rhodopsin-like GPCR superfamily and many more awaiting discovery in the human genome, they are of interest to the pharmaceutical industry because they offer potentially lucrative drug targets. Typical sequence analysis strategies for identifying novel GPCRs tend to involve similarity searches using standard primary database search tools. Such searches reveal the most similar sequence, generally without offering any insight into its family or superfamily relationships. Conversely, searches of most 'pattern' or family databases are likely to identify the superfamily, but not the closest matching subtype. Here we describe a diagnostic resource that allows identification of GPCRs in a hierarchical fashion, based principally upon their ligand preference. This resource forms part of the PRINTS database, which now houses approximately 250 GPCR-specific fingerprints (http://www.bioinf.man.ac.uk/dbbrowser/gpcrPRINTS/). This collection of fingerprints provides more sensitive diagnostic opportunities than have been realized by related approaches and is currently the only diagnostic tool for assigning GPCR subtypes. Mapping such fingerprints onto three-dimensional GPCR models offers powerful insights into the structural and functional determinants of subtype specificity.