Accelerating the search for the missing proteins in the human proteome

Authors: Mark S. Baker, Seong Beom Ahn, Abidali Mohamedali, Mohammad T Islam, David I. Cantor, Peter D E M Verhaert, Susan Fanayan, Samridhi Sharma, Edouard C. Nice, Mark Connor, Shoba Ranganathan
Journal: Nature Communications
The Human Proteome Project (HPP) aims to discover high-stringency data for all proteins encoded by the human genome. Currently, ∼18% of the proteins in the human proteome (the missing proteins) do not have high-stringency evidence (for example, mass spectrometry) confirming their existence, while much additional information is available about many of these missing proteins. Here, we present MissingProteinPedia as a community resource to accelerate the discovery and understanding of these… 

Utilization of the Proteome Data Deposited in SRMAtlas for Validating the Existence of the Human Missing Proteins in GPM.

A two-step bioinformatic strategy was developed that uses the SRMAtlas synthetic peptides corresponding to the missing proteins as an exclusive reference to explore their natural counterparts within GPM.

Looking for Missing Proteins

A high-stringency blueprint of the human proteome

On the occasion of the Human Proteome Project’s tenth anniversary, a 90.4% complete high-stringency human proteome blueprint is reported, highlighting potential roles the human proteome plays in the understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.

Proteogenomics in the context of the Human Proteome Project (HPP)

The evolution of the methodological strategies based on the combination of different omic technologies and the use of huge publicly available datasets is shown taking the Chromosome 16 Consortium as reference.

Enhanced Missing Proteins Detection in NCI60 Cell Lines Using an Integrative Search Engine Approach

All the proteomic experiments from the NCI60 cell lines were used and an integrative approach based on the results obtained from Comet, Mascot, OMSSA, and X!Tandem was applied to increase the proteome coverage, detecting 165 missing proteins with only one unique peptide.

Advances in the Chromosome-Centric Human Proteome Project: looking to the future

If the project progresses well by reshaping its original goals, current working modules and teamwork during the proposed extended planning period, a progressively more detailed draft of an accurate chromosome-based proteome map, with functional information, is expected to become available.

Digging More Missing Proteins Using an Enrichment Approach with ProteoMiner.

These results demonstrate that ProteoMiner is a powerful means of discovering missing proteins (MPs) and suggest that enriching low-abundance proteins aids MP discovery.

The exploration of Missing Proteins by a combination approach to enrich the low abundance hydrophobic proteins from four cancer cell lines.

To improve the identification of low-abundance MPs with high hydrophobicity, two approaches were combined, and MPs were identified from several different cancer cell lines.

Comprehensive characterization of the human proteome by multi-omic analyses

Deep proteomes of 29 healthy human tissues were generated using high-resolution mass spectrometry and integrated analysis with matched transcriptomes revealed unprecedented insights into proteome complexity, relationship between protein and mRNA levels, and protein expression regulation.

Research on The Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, according to the HUPO Human Proteome Project.

According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19,773 predicted proteins coded in the human genome. The



neXtProt: organizing protein knowledge in the context of human proteome projects.

How neXtProt contributes to prioritizing C-HPP efforts and integrates C-HPP results with other research efforts to create a complete human proteome catalog is described.

Probing the Missing Human Proteome: A Computational Perspective.

It is concluded that complementary spectral data searches incorporating different parameters, such as PTMs, against a comprehensive and compact search database might lead to the discovery of proteins so far regarded as part of the missing human proteome.

Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification.

The importance of assessing the quality of evidence, confirming automated findings and considering alternative protein matches for spectra and peptides is discussed and guidelines for proteomics investigators to apply in reporting newly identified proteins are provided.

Prediction of a missing protein expression map in the context of the human proteome project.

An analytical pipeline is described that predicts the probability of a missing protein being expressed in a biological sample based on (1) gene sequence characteristics, (2) the possibility of an expressed gene being a coding gene of a missing protein in a certain sample, and (3) the likelihood of a gene being expressed in a transcriptomic experiment.

Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome.

A systematic examination of the 616 PE5 protein entries suggests that experimental difficulty in identifying membrane-bound proteins and peptides could have precluded their detection in mass spectrometry and that special enrichment techniques with improved sensitivity for membrane proteins could be important for the characterization of the PE5 "dark matter" of the human proteome.

Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project.

New analytical technologies that improve the chemical space and lower detection limits coupled to bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays are presented.

Computational and Mass-Spectrometry-Based Workflow for the Discovery and Validation of Missing Human Proteins: Application to Chromosomes 2 and 14.

A rigorous step-by-step approach combining bioinformatics screening and MS-based validation assays is particularly suitable to obtain protein-level evidence for proteins previously considered as missing.

Protannotator: a semiautomated pipeline for chromosome-wise functional annotation of the "missing" human proteome.

This work has extended the annotation strategy developed for human chromosome 7 "missing" proteins into a semiautomated pipeline to functionally annotate the "missing" human proteome, and generated proteotypic peptides in silico to accelerate the identification of "missing" proteins from proteomics studies.

Functional annotation of the human chromosome 7 "missing" proteins: a bioinformatics approach.

A protocol for the functional annotation of these "missing" chromosome 7 proteins is developed by integrating several bioinformatics analysis and annotation tools, sequential BLAST homology searches, protein domain/motif and gene ontology (GO) mapping, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis.

Metrics for the Human Proteome Project 2016: Progress on Identifying and Characterizing the Human Proteome, Including Post-Translational Modifications.

The HUPO Human Proteome Project (HPP) has two overall goals: (1) stepwise completion of the protein parts list-the draft human proteome including confidently identifying and characterizing at least