The Mouse Functional Genome Database (MfunGD): functional annotation of proteins in the light of their cellular context

@article{Ruepp2006TheMF,
  title={The Mouse Functional Genome Database (MfunGD): functional annotation of proteins in the light of their cellular context},
  author={Andreas Ruepp and Octave Noubibou Doudieu and John Van Den Oever and Barbara Brauner and Irmtraud Dunger and Gisela Fobo and Goar Frishman and Corinna Montrone and Christine Skornia and Steffi Wanka and Thomas Rattei and Philipp Pagel and M. Louise Riley and Dmitrij Frishman and Dimitrij Surmeli and Igor V. Tetko and Matthias Oesterheld and Volker St{\"u}mpflen and Hans-Werner Mewes},
  journal={Nucleic Acids Research},
  year={2006},
  volume={34},
  pages={D568--D571}
}
MfunGD provides a resource for annotated mouse proteins and their occurrence in protein networks. Manual annotation concentrates on proteins that are found to interact physically with other proteins. Accordingly, manually curated information from a protein–protein interaction database (MPPI) and a database of mammalian protein complexes is interconnected with MfunGD. Protein function annotation is performed using the Functional Catalogue (FunCat) annotation scheme which is widely used for…

Predicting Functions of Proteins in Mouse Based on Weighted Protein-Protein Interaction Network and Protein Hybrid Properties
TLDR
Results indicate that the new approach, which hybridizes PPI information with the biochemical/physicochemical features of protein sequences, is quite promising and may open a new avenue for addressing this difficult and complicated problem.
Protein function prediction by collective classification with explicit and implicit edges in protein-protein interaction networks
TLDR
A new method is proposed that combines PPI information and protein sequence information to boost the prediction performance based on collective classification and is significantly better than the compared approaches in sparsely-labeled networks.
Identifying Functions of Proteins in Mice With Functional Embedding Features
TLDR
This study proposed some novel multi-label classifiers, which adopted new embedding features to represent proteins derived from functional domains and a PPI network via word embedding and network embedding, respectively.
Mining literature for systems biology
  • P. Roberts
  • Computer Science
    Briefings Bioinform.
  • 2006
TLDR
These uses of literature, specifically manual curation, derived concepts captured in ontologies and databases, and indirect and direct application of text mining, will be discussed as they pertain to systems biology.
Active learning for protein function prediction in protein-protein interaction networks
Effectively predicting protein functions by collective classification — An extended abstract
TLDR
A novel collective classification based approach that combines protein sequence information and PPI information to improve the prediction performance and validate the robustness of the approach to the number of labeled proteins in PPI networks.
Identification of protein functions in mouse with a label space partition method
TLDR
A new multi-label classifier for identifying functions of mouse proteins was presented and it was found that classifiers with label partition were superior to those without label partition or with random label partition.
Prediction of Deleterious Non-Synonymous SNPs Based on Protein Interaction Network and Hybrid Properties
TLDR
Network features were found to be most important for accurate prediction and can significantly improve the prediction performance, and the results suggest that the protein interaction context could provide important clues to help better illustrate SAP's functional association.

References

MIPS: analysis and annotation of proteins from whole genomes in 2005
TLDR
The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis, and maintains automatically generated and manually annotated genome-specific databases and provides tools for the comprehensive analysis of protein sequences.
The Mouse Genome Database (MGD): from genes to mice—a community resource for mouse biology
TLDR
Improvements in MGD discussed here include the enhancement of phenotype resources, the re-development of the International Mouse Strain Resource, IMSR, the update of mammalian orthology datasets and the electronic publication of classic books in mouse genetics.
The PEDANT genome database
TLDR
The current status of the PEDANT database and novel analytical features added to the PEDANT server in 2002 are described, including integration with the BioRS data retrieval system which allows fast text queries and a comprehensive set of tools for genome comparison.
MIPS: analysis and annotation of proteins from whole genomes
TLDR
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information and develops databases covering computable information such as the basic evolutionary relations among all genes.
The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes.
TLDR
The Functional Catalogue (FunCat), a hierarchically structured, organism-independent, flexible and scalable controlled classification system enabling the functional description of proteins from any organism, is presented.
NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins
TLDR
The National Center for Biotechnology Information Reference Sequence (RefSeq) database provides a non-redundant collection of sequences representing genomic data, transcripts and proteins that pragmatically includes sequence data that are currently publicly available in the archival databases.
NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins
TLDR
The format of all RefSeq records is validated, and an increasing number of tests are being applied to evaluate the quality of sequence and annotation, especially in the context of complete genomic sequence.
The Universal Protein Resource (UniProt)
TLDR
During 2004, tens of thousands of Knowledgebase records were manually annotated or updated; the UniProt keyword list was augmented by additional keywords; the documentation of the keywords was improved; and the annotation of post-translational modifications is continuously being overhauled and standardized.
The PEDANT genome database in 2005
TLDR
The PEDANT genome database contains pre-computed bioinformatics analyses of publicly available genomes to provide robust automatic annotation of the vast majority of amino acid sequences, which have not been subjected to in-depth manual curation by human experts in high-quality protein sequence databases.
InterPro, progress and status in 2005
TLDR
InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites.