CDD: a database of conserved domain alignments with links to domain three-dimensional structure

Aron Marchler-Bauer, Anna R. Panchenko, Benjamin A. Shoemaker, Paul A. Thiessen, Lewis Y. Geer, Stephen H. Bryant
Nucleic Acids Research, volume 30, issue 1

The Conserved Domain Database (CDD) is a compilation of multiple sequence alignments representing protein domains conserved in molecular evolution. It has been populated with alignment data from the public collections Pfam and SMART, as well as with contributions from colleagues at NCBI. The current version of CDD (v.1.54) contains 3693 such models. CDD alignments are linked to protein sequence and structure data in Entrez. The molecular structure viewer Cn3D serves as a tool to interactively… 

CDD: specific functional annotation with the Conserved Domain Database
NCBI's Conserved Domain Database is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution, and provides annotation of domain footprints and conserved functional sites on protein sequences.
CDD: a curated Entrez database of conserved domain alignments
The Conserved Domain Database (CDD), which mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI, is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE(R).
CDD: a Conserved Domain Database for protein classification
Reports on the progress of the curation effort and the associated improvements in the functionality of the CDD information retrieval system.
CDD: a Conserved Domain Database for the functional annotation of proteins
NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD
CDD/SPARCLE: the conserved domain database in 2020
As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely
NCBI's Conserved Domain Database and Tools for Protein Domain Analysis
The CDD maintains both live search capabilities and an archive of pre‐computed domain annotations for a selected subset of sequences tracked by NCBI's Entrez protein database; annotations can be retrieved or computed for a single sequence using CD‐Search, in bulk using Batch CD‐Search, or via standalone RPS‐BLAST plus the rpsbproc software package.
CDART: protein homology by domain architecture.
The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved
3. Macromolecular Structure Databases
To enable scientists to accomplish these tasks, NCBI has integrated MMDB and CDD into the Entrez retrieval system and sequences derived from MMDB structures have been included in the BLAST databases.
Annotation of functional sites with the Conserved Domain Database
It is observed that CDD-based site annotation complements existing site annotation in many cases, which may, in part, originate from CDD's curation practice of collecting sites conserved across diverse taxa and supported by evidence from multiple 3D structures.
Improving protein structure similarity searches using domain boundaries based on conserved sequence information
Alternative domains, which have significantly different secondary structure composition from those based on structurally compact units, were identified based on the alignment footprints of curated protein sequence domain families and are in the process of inclusion into the VAST search and MMDB resources in the NCBI Entrez system.


Profile analysis: detection of distantly related proteins.
Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
Threading with explicit models for evolutionary conservation of structure and sequence
It appears that threading with family‐specific models for structure and sequence conservation has improved threading prediction accuracy, and predictions were ranked “first place” by the CASP3 assessor when compared to fold‐recognition predictions made by other methods.
SMART: a web-based tool for the study of genetically mobile domains
SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures.
Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods.
The extent to which the SAM-T98 implementation of a hidden Markov model procedure; PSI-BLAST; and the intermediate sequence search (ISS) procedure can detect evolutionary relationships between the members of the sequence database PDBD40-J is determined.
Hidden Markov models for detecting remote protein homologies
A new hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences is described and evaluated, which is optimized to recognize superfamilies, and would require parameter adjustment to be used to find family or fold relationships.
Embedding strategies for effective use of information from multiple sequence alignments
A new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases is described, and PSSM‐embedded queries produced the best results overall when used with a special version of the Smith‐Waterman searching algorithm.
Cn3D: sequence and structure views for Entrez.
The crystal structure of DNA mismatch repair protein MutS binding to a G·T mismatch
Mutations in human MutSα (MSH2/MSH6) that lead to hereditary predisposition for cancer, such as hereditary non-polyposis colorectal cancer, can be mapped to this crystal structure.
Profile hidden Markov models
  • S. Eddy
  • Computer Science
  • 1998
Profile HMM methods performed comparably to threading methods in the CASP2 structure prediction exercise and complement standard pairwise comparison methods for large-scale sequence analysis.
The Pfam protein families database
  • Nucleic Acids Res.
  • 2000