Abstract

Secreted proteins (SPs) play important roles in diverse biological processes; however, a comprehensive and high-quality list of human SPs is still lacking. Here we identified 6,943 high-confidence human SPs (3,522 of them novel) from 330,427 human proteins derived from the UniProt, Ensembl, AceView, and RefSeq databases. Notably, 6,267 of the 6,943 SPs (90.3%) are supported by evidence from large-scale mass spectrometry (MS) and RNA-seq data. We found that the SPs were broadly expressed in diverse tissues as well as in human body fluids, and a significant portion of them exhibited tissue-specific expression. Moreover, we identified 14 cancer-specific SPs whose expression levels were significantly associated with patient survival in eight different tumor types; these could be potential prognostic biomarkers. Strikingly, 89.21% of the 6,943 SPs (including 2,927 novel SPs) contain known protein domains. The novel SPs were mainly enriched in known immunity-related domains, such as the Immunoglobulin V-set and C1-set domains. Finally, we constructed a user-friendly and freely accessible database, SPRomeDB (www.unimd.org/SPRomeDB), to catalog these SPs. Our comprehensive SP identification and characterization provides insights into the human secretome and a valuable resource for future research.
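The proportions quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names and the `pct` helper are hypothetical, and the only inputs are the counts stated in the abstract. The derived values (the share of novel SPs, the approximate number of SPs with known domains, and the share of novel SPs with known domains) are computed here and are not reported by the authors.

```python
# Counts quoted in the abstract (variable names are illustrative).
total_sps = 6943            # high-confidence SPs
novel_sps = 3522            # SPs reported as novel
supported_sps = 6267        # SPs with MS and RNA-seq evidence
novel_with_domains = 2927   # novel SPs containing known domains
domain_fraction = 0.8921    # fraction of all SPs with known domains


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Reported in the abstract as 90.3%.
supported_pct = pct(supported_sps, total_sps)

# Derived here, not stated in the abstract.
novel_pct = pct(novel_sps, total_sps)
sps_with_domains = round(domain_fraction * total_sps)
novel_domain_pct = pct(novel_with_domains, novel_sps)

print(f"SPs with MS/RNA-seq support: {supported_pct}%")
print(f"Novel SPs among all SPs: {novel_pct}%")
print(f"SPs with known domains (approx.): {sps_with_domains}")
print(f"Novel SPs with known domains: {novel_domain_pct}%")
```

The first value reproduces the 90.3% stated in the abstract; the remaining figures are simple consequences of the quoted counts and should be read as a sanity check rather than as additional results.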

Highlights

  • The secretome of an organism represents the proteins released by all types of cells/tissues of this organism (Chua et al, 2012)

  • We systematically explored human secreted proteins (SPs) using the non-redundant proteins integrated from the UniProt, Ensembl, AceView, and RefSeq databases

  • Although our stringent criteria missed a number of gold standard secreted proteins (GSSPs), the SPs we identified were of high confidence, and only 8 of them (0.1%) overlapped with gold standard non-secreted proteins (GSNPs)

Summary

Introduction

The secretome of an organism represents the proteins released by all types of cells/tissues of this organism (Chua et al, 2012). Many SPs have been identified as important biomarkers for diverse cancers, and some of them could be therapeutic targets (Schaaij-Visser et al, 2013). The strategies for identifying SPs fall into two main categories: proteomic identification and genome-based computational prediction (Hathout, 2007). Improvements in high-throughput liquid chromatography-coupled tandem mass spectrometry (LC-MS/MS) have allowed the identification of over 1,000 proteins in a single experiment (Schaaij-Visser et al, 2013; Ichibangase and Imai, 2014; Li et al, 2017; Zhang et al, 2018), making the proteomic approach the mainstay of SP identification. Exploration of the human secretome at both the transcriptome and proteome levels is still lacking, and the functions of SPs remain largely unknown.
