Abstract

The Human Proteome Project (HPP) aims to decipher the complete map of the human proteome. In the past few years, the HPP teams have dedicated significant effort to the experimental detection of the missing proteins, which lack reliable mass spectrometry evidence of their existence. In this endeavor, an in-depth analysis of shotgun experiments might be a valuable resource for selecting a biological matrix when designing validation experiments. In this work, we used all the proteomic experiments from the NCI60 cell lines and applied an integrative approach based on the results obtained from Comet, Mascot, OMSSA, and X!Tandem. This workflow benefits from the complementarity of these search engines to increase proteome coverage. Five missing proteins were identified in compliance with the C-HPP guidelines, although further validation is needed. Moreover, 165 missing proteins were detected with only one unique peptide, and their functional analysis supported their participation in cellular pathways, as was also proposed in other studies. Finally, we performed a combined analysis of the gene expression levels and the proteomic identifications from the cell lines common to the NCI60 and the CCLE project to suggest alternatives for further validation of missing protein observations.

Highlights

  • Since 2010, the Human Proteome Project (HPP)[1,2] has brought together the efforts of the international research community in the fields of proteomics, bioinformatics, and molecular biology to (1) define the complete catalog of human proteins (C-HPP initiative[3]) and (2) study the functions of proteins in biology and disease (B/D-HPP initiative[4−6]).

  • The codes PE2, PE3, and PE4 correspond to missing proteins, while PE1 is the annotation for proteins with strong evidence from mass spectrometry or other experimental methods, and PE5 is the code for uncertain proteins. neXtProt includes the most up-to-date annotation of the human proteome, and other information

  • We developed a bioinformatic workflow (Figure 1) for the detection of missing proteins based on three pillars: (1) the strict application of the C-HPP guidelines for the detection of proteins using MS/MS experiments; (2) the analysis of shotgun experiments of 59 different cell lines using an integrative approach based on four search engines; and (3) the quantification of the expression level of the protein-coding genes in these cell lines as guidance for predicting suitable sample sources for the targeted proteomic validation experiments.

Summary

Introduction

Since 2010, the Human Proteome Project (HPP)[1,2] has brought together the efforts of the international research community in the fields of proteomics, bioinformatics, and molecular biology to (1) define the complete catalog of human proteins (C-HPP initiative[3]) and (2) study the functions of proteins in biology and disease (B/D-HPP initiative[4−6]). In terms of human proteome characterization, the main objective is the detection of the proteins that lack sufficient experimental evidence from mass spectrometry, known as the “missing proteins” or “missing proteome”. neXtProt[13] (www.nextprot.org) has been consolidated as the key resource for evaluating the advances of the C-HPP initiative in the description of the human proteome. In this database, different experimental evidence categories are assigned to each protein. The codes PE2 (experimental evidence at transcript level), PE3 (protein inferred from homology), and PE4 (predicted protein) correspond to missing proteins, while PE1 is the annotation for proteins with strong evidence from mass spectrometry or other experimental methods, and PE5 is the code for uncertain proteins. neXtProt includes the most up-to-date annotation of the human proteome, and other information
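The protein existence (PE) categories described above can be illustrated with a minimal Python sketch. It only encodes the mapping stated in the text (PE1 as detected, PE2 to PE4 as missing, PE5 as uncertain). The names `EvidenceStatus`, `classify_pe`, and `missing_proteins`, as well as the accessions in the usage note, are hypothetical and chosen for illustration; they are not part of neXtProt or of the workflow presented in this work.

```python
from enum import Enum
from typing import Dict, List


class EvidenceStatus(Enum):
    """Coarse status implied by a protein existence (PE) code."""
    DETECTED = "detected"    # PE1: strong protein-level evidence
    MISSING = "missing"      # PE2, PE3, PE4: the "missing proteins"
    UNCERTAIN = "uncertain"  # PE5: uncertain proteins


# Mapping taken from the description in the text above.
PE_STATUS = {
    "PE1": EvidenceStatus.DETECTED,
    "PE2": EvidenceStatus.MISSING,   # evidence at transcript level
    "PE3": EvidenceStatus.MISSING,   # inferred from homology
    "PE4": EvidenceStatus.MISSING,   # predicted protein
    "PE5": EvidenceStatus.UNCERTAIN,
}


def classify_pe(code: str) -> EvidenceStatus:
    """Return the status for a PE code such as 'PE2' (case-insensitive)."""
    key = code.strip().upper()
    if key not in PE_STATUS:
        raise ValueError(f"unknown protein existence code: {code!r}")
    return PE_STATUS[key]


def missing_proteins(entries: Dict[str, str]) -> List[str]:
    """Return the sorted accessions whose PE code marks them as missing.

    `entries` maps an accession (hypothetical identifiers here) to its PE code.
    """
    return sorted(
        accession
        for accession, code in entries.items()
        if classify_pe(code) is EvidenceStatus.MISSING
    )
```

For example, calling `missing_proteins({"ACC_A": "PE1", "ACC_B": "PE3", "ACC_C": "PE5"})` with these made-up accessions keeps only `"ACC_B"`, since PE1 and PE5 entries are not counted as missing proteins.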

