Abstract

In the postgenome era, biologists have sought to measure the complete complement of proteins, an effort termed proteomics. Currently, the most effective way to measure the proteome is shotgun, or bottom-up, proteomics, in which the proteome is digested into peptides that are identified and then used for protein inference. Despite continuous improvements to all steps of the shotgun proteomics workflow, observed proteome coverage is often low; some proteins are identified by a single peptide sequence. Complete proteome sequence coverage would allow comprehensive characterization of RNA splicing variants and all posttranslational modifications, which would drastically improve the accuracy of biological models. There are many reasons for the sequence coverage deficit, but ultimately peptide length determines sequence observability. Peptides that are too short are lost because they match many protein sequences, so their true origin is ambiguous. The maximum observable peptide length is set by several analytical challenges. This paper explores computationally how the peptide lengths produced by several common proteome digestion methods limit observable proteome coverage. Iterative proteome cleavage strategies are also explored. These simulations reveal that proteome coverage can be maximized with an iterative digestion protocol involving multiple proteases and chemical cleavages, which theoretically allows 92.9% proteome coverage.
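
The link between peptide length and observable coverage can be illustrated with a minimal sketch. It assumes a trypsin-like rule (cleave after K or R unless the next residue is P) and a hypothetical observable window of 7 to 35 residues; neither the rule nor the window is taken from this paper, and the names `digest` and `coverage` are illustrative only.

```python
import re

# Assumed trypsin-like rule: a zero-width cleavage site after K or R,
# unless the next residue is P. This is an illustration, not the paper's rule set.
TRYPSIN_LIKE = r"(?<=[KR])(?!P)"


def digest(protein: str, site_pattern: str = TRYPSIN_LIKE) -> list[str]:
    """Split a protein sequence into peptides at every cleavage site."""
    peptides = []
    start = 0
    for match in re.finditer(site_pattern, protein):
        cut = match.start()
        # Ignore a site at the very end of the sequence (nothing left to split off).
        if 0 < cut < len(protein):
            peptides.append(protein[start:cut])
            start = cut
    peptides.append(protein[start:])
    return peptides


def coverage(protein: str, min_len: int = 7, max_len: int = 35) -> float:
    """Fraction of residues that fall in peptides inside the length window.

    min_len and max_len are hypothetical observability limits.
    Peptides from a single complete digestion do not overlap, so their
    lengths can simply be summed.
    """
    covered = sum(
        len(peptide)
        for peptide in digest(protein)
        if min_len <= len(peptide) <= max_len
    )
    return covered / len(protein)


if __name__ == "__main__":
    toy_protein = "AAAAAAAK" + "GGRPGGGGGGR" + "CCK"  # made-up sequence
    print(digest(toy_protein))
    print(f"coverage = {coverage(toy_protein):.3f}")
```

In this toy case the short C-terminal peptide falls below the assumed minimum length, so its residues count as unobserved. A second digestion with a different cleavage rule could place those residues in a longer peptide, which is the intuition behind combining digestion methods.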

Highlights

  • In the postgenome era, biologists have sought system-wide measurements of RNA, proteins, and metabolites, termed transcriptomics, proteomics, and metabolomics, respectively.

  • The ability to cover 100% of protein sequences in a biological system was likened to surrealism in a recent review by Meyer et al. [2].

  • Proteome fragmentation is generally accomplished by targeting one or more amino acid residues for cleavage, and the protein cleavage can be likened to a Poisson process that produces an exponential distribution of peptide lengths.


Summary

Introduction

Biologists have sought system-wide measurements of RNA, proteins, and metabolites, termed transcriptomics, proteomics, and metabolomics, respectively. In proteomics, however, observed protein sequence coverage is often low. Multiple steps in the traditional shotgun proteomics workflow contribute to this deficit, including proteome isolation, proteome digestion, peptide separation, peptide MS/MS, and identification by peptide-spectrum matching. Several types of peptide separation have been explored [5,6,7]. Peptide-spectrum matching algorithms are adapting to new data types [11] and becoming more sensitive [12, 13]. Proteome fragmentation into sequenceable peptides is one step with significant room for improvement. Fragmentation is generally accomplished by targeting one or more amino acid residues for cleavage, and the cleavage can be likened to a Poisson process that produces an exponential distribution of peptide lengths.
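
The Poisson-process analogy can be made concrete with a short simulation. The sketch below assumes that every residue is a cleavage site independently with probability `p_cleave`; under that assumption peptide lengths follow a geometric distribution, the discrete counterpart of the exponential, with P(L = k) = (1 - p)^(k - 1) p and mean 1/p. The value 0.1 and the 7 to 35 residue window are hypothetical, and all function names are illustrative rather than taken from the paper.

```python
import random


def simulate_peptide_lengths(n_residues: int, p_cleave: float, seed: int = 0) -> list[int]:
    """Walk along n_residues and cut after each one with probability p_cleave."""
    rng = random.Random(seed)
    lengths = []
    current = 0
    for _ in range(n_residues):
        current += 1
        if rng.random() < p_cleave:
            lengths.append(current)
            current = 0
    if current:
        lengths.append(current)  # trailing peptide without a terminal cut
    return lengths


def geometric_pmf(k: int, p: float) -> float:
    """P(L = k) for independent cleavage with per-residue probability p."""
    return (1 - p) ** (k - 1) * p


def fraction_in_window(p: float, min_len: int, max_len: int) -> float:
    """Expected fraction of residues in peptides with min_len <= L <= max_len.

    Residue-weighted: sum of k * P(L = k) over the window, divided by the
    mean length 1/p.
    """
    return p * sum(k * geometric_pmf(k, p) for k in range(min_len, max_len + 1))


if __name__ == "__main__":
    p = 0.1  # hypothetical cleavage-site frequency
    lengths = simulate_peptide_lengths(200_000, p)
    print(f"simulated mean length: {sum(lengths) / len(lengths):.2f}")
    print(f"theoretical mean length: {1 / p:.2f}")
    print(f"residues in 7-35 window: {fraction_in_window(p, 7, 35):.3f}")
```

Because the geometric distribution puts its largest probability on the shortest lengths, a sizeable share of residues ends up in peptides below any minimum length cutoff, which is one way to see why a single cleavage rule cannot cover every residue under this simplified model.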

