Accelerating the search for the missing proteins in the human proteome

Authors: Mark S. Baker, Seong Beom Ahn, Abidali Mohamedali, Mohammad T Islam, David I. Cantor, Peter D E M Verhaert, Susan Fanayan, Samridhi Sharma, Edouard C. Nice, Mark Connor, Shoba Ranganathan
Journal: Nature Communications
The Human Proteome Project (HPP) aims to discover high-stringency data for all proteins encoded by the human genome. Currently, ∼18% of the proteins in the human proteome (the missing proteins) do not have high-stringency evidence (for example, mass spectrometry) confirming their existence, while much additional information is available about many of these missing proteins. Here, we present MissingProteinPedia as a community resource to accelerate the discovery and understanding of these… 

Looking for Missing Proteins

A high-stringency blueprint of the human proteome

On the occasion of the Human Proteome Project’s tenth anniversary, a 90.4% complete high-stringency human proteome blueprint is reported, highlighting potential roles the human proteome plays in the understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.

Proteogenomics in the context of the Human Proteome Project (HPP)

The evolution of methodological strategies based on the combination of different omic technologies and the use of huge publicly available datasets is shown, taking the Chromosome 16 Consortium as a reference.

Digging More Missing Proteins Using an Enrichment Approach with ProteoMiner.

These results demonstrated that ProteoMiner is a powerful means of discovering MPs and suggested that enriching low-abundance proteins benefits MP discovery.

The exploration of Missing Proteins by a combination approach to enrich the low abundance hydrophobic proteins from four cancer cell lines.

To improve the identification of low-abundance MPs with high hydrophobicity, two approaches were combined and MPs were obtained from several different cancer cell lines.

Comprehensive characterization of the human proteome by multi-omic analyses

Deep proteomes of 29 healthy human tissues were generated using high-resolution mass spectrometry and integrated analysis with matched transcriptomes revealed unprecedented insights into proteome complexity, relationship between protein and mRNA levels, and protein expression regulation.

Research on The Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, according to the HUPO Human Proteome Project.

According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19,773 predicted proteins coded in the human genome. The

Progress on the HUPO Draft Human Proteome: 2017 Metrics of the Human Proteome Project.

The Human Proteome Organization (HUPO) Human Proteome Project (HPP) continues to make progress on its two overall goals: (1) completing the protein parts list, with an annual update of the HUPO draft

Proteogenomic Analysis to Identify Missing Proteins from Haploid Cell Lines

This study demonstrates that expression proteomics coupled to proteogenomic analysis can be employed to identify many annotated and unannotated missing proteins.

Decoding the Effect of Isobaric Substitutions on Identifying Missing Proteins and Variant Peptides in Human Proteome.

A thorough comparative analysis of data sets in the PeptideAtlas Tiered Human Integrated Search Proteome is conducted to systematically investigate the possibility that unique peptides in missing proteins (PE2-4), unique peptides in dubious proteins, and variant peptides are affected by isobaric substitutions, causing doubtful identification results.



neXtProt: organizing protein knowledge in the context of human proteome projects.

How neXtProt helps prioritize C-HPP efforts and integrates C-HPP results with other research efforts to create a complete human proteome catalog is described.

Probing the Missing Human Proteome: A Computational Perspective.

It is concluded that complementary spectral data searches incorporating different parameters, such as PTMs, against a comprehensive and compact search database might lead to the discovery of proteins so far attributed to the missing human proteome.

Metrics for the Human Proteome Project 2015: Progress on the Human Proteome and Guidelines for High-Confidence Protein Identification.

The importance of assessing the quality of evidence, confirming automated findings and considering alternative protein matches for spectra and peptides is discussed and guidelines for proteomics investigators to apply in reporting newly identified proteins are provided.

Prediction of a missing protein expression map in the context of the human proteome project.

An analytical pipeline is described that predicts the probability of a missing protein being expressed in a biological sample based on (1) gene sequence characteristics, (2) the possibility of an expressed gene being a coding gene of a missing protein in a certain sample, and (3) the likelihood of a gene being expressed in a transcriptomic experiment.

Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome.

A systematic examination of the 616 PE5 protein entries suggests that experimental difficulty in identifying membrane-bound proteins and peptides could have precluded their detection in mass spectrometry and that special enrichment techniques with improved sensitivity for membrane proteins could be important for the characterization of the PE5 "dark matter" of the human proteome.

Quest for Missing Proteins: Update 2015 on Chromosome-Centric Human Proteome Project.

New analytical technologies that improve the chemical space and lower detection limits coupled to bioinformatics tools and some publicly available resources that can be used to improve data analysis or support the development of analytical assays are presented.

Computational and Mass-Spectrometry-Based Workflow for the Discovery and Validation of Missing Human Proteins: Application to Chromosomes 2 and 14.

A rigorous step-by-step approach combining bioinformatics screening and MS-based validation assays is particularly suitable to obtain protein-level evidence for proteins previously considered as missing.

Special Enrichment Strategies Greatly Increase the Efficiency of Missing Proteins Identification from Regular Proteome Samples.

It is shown that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies.

The Human Proteome Project: Current State and Future Direction

The Human Proteome Organization urges each national research funding agency and the scientific community at large to identify their preferred pathways to participate in aspects of this highly promising project in a HPP consortium of funders and investigators.

Protannotator: a semiautomated pipeline for chromosome-wise functional annotation of the "missing" human proteome.

This work has extended the annotation strategy developed for human chromosome 7 "missing" proteins into a semiautomated pipeline to functionally annotate the "missing" human proteome, and generated proteotypic peptides in silico to accelerate the identification of "missing" proteins from proteomics studies.
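
As a rough illustration of what generating peptides in silico can involve, the sketch below performs a simple tryptic digestion: it cleaves after K or R unless the next residue is P, then keeps peptides within a length window. This is not the Protannotator method, which is not described here. The function name `tryptic_digest`, the length bounds, and the toy sequence are assumptions made for illustration, and a real proteotypic-peptide selection would also need to check uniqueness across the proteome and detectability.

```python
import re


def tryptic_digest(sequence, min_len=7, max_len=30):
    """Return candidate tryptic peptides from a protein sequence.

    Assumed rule: cleave after K or R, except when the next residue is P.
    The length bounds are illustrative defaults, not values from the source.
    """
    # Zero-width split at every position preceded by K/R and not followed by P
    # (requires Python 3.7+, where re.split accepts empty matches).
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop empty fragments and peptides outside the length window.
    return [p for p in pieces if min_len <= len(p) <= max_len]


# Toy sequence (hypothetical, not a real protein).
toy = "AAAKPCCCRDDDDKEEE"
print(tryptic_digest(toy, min_len=3))
```

In this toy case the K followed by P is not cleaved, so the first fragment spans both residues. Candidate peptides produced this way would still need to be filtered against the rest of the proteome before being treated as proteotypic.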