CDD: a database of conserved domain alignments with links to domain three-dimensional structure
@article{MarchlerBauer2002CDDAD, title={CDD: a database of conserved domain alignments with links to domain three-dimensional structure}, author={Aron Marchler-Bauer and Anna R. Panchenko and Benjamin A. Shoemaker and Paul A. Thiessen and Lewis Y. Geer and Stephen H. Bryant}, journal={Nucleic Acids Research}, year={2002}, volume={30}, number={1}, pages={281--283} }
The Conserved Domain Database (CDD) is a compilation of multiple sequence alignments representing protein domains conserved in molecular evolution. It has been populated with alignment data from the public collections Pfam and SMART, as well as with contributions from colleagues at NCBI. The current version of CDD (v.1.54) contains 3693 such models. CDD alignments are linked to protein sequence and structure data in Entrez. The molecular structure viewer Cn3D serves as a tool to interactively…
649 Citations
CDD: specific functional annotation with the Conserved Domain Database
- Biology
- Nucleic Acids Res.
- 2009
NCBI's Conserved Domain Database is a collection of multiple sequence alignments and derived database search models, which represent protein domains conserved in molecular evolution, and provides annotation of domain footprints and conserved functional sites on protein sequences.
CDD: a curated Entrez database of conserved domain alignments
- Computer Science
- Nucleic Acids Res.
- 2003
The Conserved Domain Database (CDD), which mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI, is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE®.
CDD: a Conserved Domain Database for protein classification
- Biology
- Nucleic Acids Res.
- 2005
The progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system are reported on.
CDD: a Conserved Domain Database for the functional annotation of proteins
- Biology
- Nucleic Acids Res.
- 2011
NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD…
CDD/SPARCLE: the conserved domain database in 2020
- Computer Science
- Nucleic Acids Res.
- 2020
As NLM's Conserved Domain Database (CDD) enters its 20th year of operations as a publicly available resource, CDD curation staff continues to develop hierarchical classifications of widely…
NCBI's Conserved Domain Database and Tools for Protein Domain Analysis
- Biology
- Current protocols in bioinformatics
- 2020
The CDD maintains both live search capabilities and an archive of pre‐computed domain annotations for a selected subset of sequences tracked by the NCBI's Entrez protein database, which can be retrieved or computed for a single sequence using CD‐Search or in bulk using Batch CD‐Search, or computed via standalone RPS‐BLAST plus the rpsbproc software package.
CDART: protein homology by domain architecture.
- Computer Science
- Genome research
- 2002
The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved…
3. Macromolecular Structure Databases
- Biology
- 2003
To enable scientists to accomplish these tasks, NCBI has integrated MMDB and CDD into the Entrez retrieval system and sequences derived from MMDB structures have been included in the BLAST databases.
Annotation of functional sites with the Conserved Domain Database
- Biology
- Database J. Biol. Databases Curation
- 2012
It is observed that CDD-based site annotation complements existing site annotation in many cases, which may, in part, originate from CDD's curation practice of collecting sites conserved across diverse taxa and supported by evidence from multiple 3D structures.
Improving protein structure similarity searches using domain boundaries based on conserved sequence information
- Computer Science
- BMC Structural Biology
- 2008
Alternative domains, which have significantly different secondary structure composition from those based on structurally compact units, were identified based on the alignment footprints of curated protein sequence domain families and are in the process of inclusion into the VAST search and MMDB resources in the NCBI Entrez system.
References
Profile analysis: detection of distantly related proteins.
- Biology, Computer Science
- Proceedings of the National Academy of Sciences of the United States of America
- 1987
Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
Threading with explicit models for evolutionary conservation of structure and sequence
- Biology
- Proteins
- 1999
It appears that threading with family‐specific models for structure and sequence conservation has improved threading prediction accuracy, and predictions were ranked “first place” by the CASP3 assessor when compared to fold‐recognition predictions made by other methods.
SMART: a web-based tool for the study of genetically mobile domains
- Computer Science, Biology
- Nucleic Acids Res.
- 2000
SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (http://SMART.embl-heidelberg.de).…
Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods.
- Biology
- Journal of molecular biology
- 1998
The extent to which the SAM-T98 implementation of a hidden Markov model procedure; PSI-BLAST; and the intermediate sequence search (ISS) procedure can detect evolutionary relationships between the members of the sequence database PDBD40-J is determined.
Hidden Markov models for detecting remote protein homologies
- Computer Science
- Bioinform.
- 1998
A new hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences is described and evaluated, which is optimized to recognize superfamilies, and would require parameter adjustment to be used to find family or fold relationships.
Embedding strategies for effective use of information from multiple sequence alignments
- Computer Science, Biology
- Protein Science: a publication of the Protein Society
- 1997
A new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases is described, and PSSM‐embedded queries produced the best results overall when used with a special version of the Smith‐Waterman searching algorithm.
The crystal structure of DNA mismatch repair protein MutS binding to a G·T mismatch
- Biology
- Nature
- 2000
Mutations in human MutSα (MSH2/MSH6) that lead to hereditary predisposition for cancer, such as hereditary non-polyposis colorectal cancer, can be mapped to this crystal structure.
Profile hidden Markov models
- Computer Science
- Bioinform.
- 1998
Profile HMM methods performed comparably to threading methods in the CASP2 structure prediction exercise and complement standard pairwise comparison methods for large-scale sequence analysis.
The Pfam proteins family database
- Nucleic Acids Res.
- 2000