Abstract

Despite tremendous efforts by the genomics, transcriptomics, and proteomics communities, there are still no comprehensive data on the exact number of protein-coding genes, the translated proteoforms, and their functions. In addition, functional annotation is still lacking for 1193 genes whose expression has been confirmed at the proteomic level (uPE1 proteins). We re-analyzed the results of AP-MS experiments from the BioPlex 2.0 database to predict the functions of uPE1 proteins and their splice forms. By building a protein–protein interaction network for 12 thousand identified proteins encoded by 11 thousand genes, we were able to predict Gene Ontology categories for a total of 387 uPE1 genes. For four uPE1 genes, we predicted different functions for the canonical and alternatively spliced forms. In total, functional differences were revealed for 62 proteoforms encoded by 31 genes. Based on these results, it can be cautiously concluded that the dynamics and versatility of the interactome are ensured by changes in the dominant splice form. Overall, we propose that the analysis of large-scale AP-MS experiments performed for various cell lines and under various conditions is key to understanding the full potential of genes' roles in cellular processes.

Highlights

  • Prior to the start of the “Human Genome” international project, it was assumed that our genome contains ca. 100 thousand protein-coding genes (PCGs, [1]), which determine the complexity of the human body

  • Additional information was obtained on the number of translated splice forms and on genes that were confirmed at the proteomic level but still have no functional annotation [15,16]

  • We considered only cases where the interactomic profiles coincided by more than 50%
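
The 50% criterion in the last highlight can be illustrated with a small sketch. The excerpt does not say how the coincidence of two interactomic profiles is measured, so the overlap coefficient used here (shared partners divided by the size of the smaller profile) is an assumption, as are the function names and the toy identifiers; none of them come from the paper or from BioPlex.

```python
def profile_overlap(partners_a, partners_b):
    """Share of common interaction partners, relative to the smaller profile.

    Assumption: the source does not define the measure; this is the
    overlap coefficient, chosen only for illustration.
    """
    a, b = set(partners_a), set(partners_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def similar_profiles(query, profiles, threshold=0.5):
    """Return (protein, score) pairs whose overlap with the query exceeds threshold."""
    query_partners = profiles[query]
    hits = []
    for name, partners in profiles.items():
        if name == query:
            continue
        score = profile_overlap(query_partners, partners)
        if score > threshold:  # strictly "more than 50%"
            hits.append((name, score))
    # Highest overlap first
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data with hypothetical identifiers, not BioPlex records.
profiles = {
    "uPE1_X": {"P1", "P2", "P3", "P4"},
    "KNOWN_A": {"P1", "P2", "P3", "P9"},
    "KNOWN_B": {"P4", "P7", "P8", "P9"},
}
print(similar_profiles("uPE1_X", profiles))  # [('KNOWN_A', 0.75)]
```

In this toy case only KNOWN_A passes the threshold, so under a guilt-by-association reading its annotations would be the candidates to transfer to the unannotated protein. A Jaccard index or a directional overlap would be equally plausible readings of the criterion and could give different hit lists.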

Summary

Introduction

Prior to the start of the “Human Genome” international project, it was assumed that our genome contains ca. 100 thousand protein-coding genes (PCGs, [1]), which determine the complexity of the human body. Despite tremendous efforts by the genomics community, the challenge of identifying all human PCGs still confronts us: the data in neXtProt and UniProt are constantly being updated [5,6]. The ongoing “Human Proteome” project [7] aims to answer a fundamental question: how many protein-coding genes do we have, and what are their roles in cellular processes? Since the start of the Human Proteome Project, significant progress has been made: more than 90% of protein-coding genes have been registered at the proteomic level [8].
