Abstract

An increasing number of studies involve integrative analysis of gene and protein expression data, taking advantage of new technologies such as next-generation transcriptome sequencing and highly sensitive mass spectrometry (MS) instrumentation. Recently, a strategy termed ribosome profiling (or RIBO-seq) has been described; it is based on deep sequencing of ribosome-protected mRNA fragments and thereby indirectly monitors protein synthesis. We devised a proteogenomic approach that constructs a custom protein sequence search space, built from both Swiss-Prot- and RIBO-seq-derived translation products, applicable for MS/MS spectrum identification. To record the impact of using the constructed deep proteome database, we performed two alternative MS-based proteomic strategies as follows: (i) a regular shotgun proteomic approach and (ii) an N-terminal combined fractional diagonal chromatography (COFRADIC) approach. Although the former technique gives an overall assessment at the protein and peptide level, the latter technique, which specifically enables the isolation of N-terminal peptides, is well suited to validating the RIBO-seq-derived (alternative) translation initiation site profile. We demonstrate that this proteogenomic approach increases the overall protein identification rate by 2.5% (e.g. new protein products, new protein splice variants, single nucleotide polymorphism variant proteins, and N-terminally extended forms of known proteins) as compared with searching UniProtKB-SwissProt only. Furthermore, using this custom database, identification of N-terminal COFRADIC data resulted in the detection of 16 alternative start sites giving rise to N-terminally extended protein variants, in addition to the identification of four translated upstream ORFs. Notably, the characterization of these new translation products revealed the use of multiple near-cognate (non-AUG) start codons.
As deep sequencing techniques are becoming more standard, less expensive, and widespread, we anticipate that mRNA sequencing and especially custom-tailored RIBO-seq will become indispensable in the MS-based protein or peptide identification process. The underlying mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000124.

Highlights

  • From the ‡Laboratory of Bioinformatics and Computational Genomics, Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, B-9000, Ghent, the ‖Stem Cell Institute Leuven, Department of Development and Regeneration, Catholic University, Leuven, B-3000 Leuven, the ‡‡Department of Medical Protein Research, Flemish Institute for Biotechnology, B-9000 Ghent, and the §§Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium

  • To record the impact of using the constructed deep proteome database, we performed two types of proteome analysis as follows: (i) a regular shotgun proteomic approach and (ii) an N-terminal COFRADIC approach. The former gives an overall assessment at the protein and peptide level; the latter, by enriching for N-terminal peptides, is highly suited for validating the RIBO-seq translation initiation site observations [15].

  • Shotgun Proteomics: Using the custom combined database as search space, the number of protein identifications increases by 2.64% compared with searching the UniProtKB-SwissProt reference set only.
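
The idea of a combined search space can be illustrated with a minimal sketch. It assumes both sequence sets are available as FASTA text and that exact duplicate sequences should be collapsed, with the reference entry taking precedence; the function names, the `ribo|` header prefix, and the toy sequences are hypothetical and are not taken from the study's actual pipeline.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_search_space(
    reference: Iterable[Tuple[str, str]],
    riboseq: Iterable[Tuple[str, str]],
) -> Dict[str, str]:
    """Map each unique sequence to one header.

    Reference entries are added first, so a RIBO-seq-derived product
    with an identical sequence does not create a second entry
    (an assumption of this sketch).
    """
    combined: Dict[str, str] = {}
    for header, seq in reference:
        combined.setdefault(seq, header)
    for header, seq in riboseq:
        combined.setdefault(seq, header)
    return combined


def to_fasta(combined: Dict[str, str]) -> str:
    """Serialize the combined search space back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for seq, header in combined.items())


# Toy, made-up sequences for illustration only.
reference_fasta = ">sp|TOY1|Toy protein one\nMKTAYIAK\n>sp|TOY2|Toy protein two\nMSLLQ\n"
riboseq_fasta = ">ribo|tx1_ext|N-terminally extended toy\nMAAGMKTAYIAK\n>ribo|tx2|Same as TOY2\nMSLLQ\n"

search_space = build_search_space(
    parse_fasta(reference_fasta), parse_fasta(riboseq_fasta)
)
print(to_fasta(search_space))
```

In this toy case the extended variant is kept as a new entry, while the RIBO-seq product identical to a reference protein is collapsed into the reference entry.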


Summary

Introduction

After MS/MS spectra acquisition, protein sequence database searching (Mascot [3], X!Tandem [4], and OMSSA [5], among others) is used for peptide identification. [...] the real protein pool of a specific sample or even be all-inclusive. A new strategy, termed ribosome profiling (or RIBO-seq), based on deep sequencing of ribosome-protected mRNA fragments and thus monitoring protein synthesis, has been described [13, 14]. Ribosome profiling is more suitable than mRNA-seq for delineating the exact ORFs and deriving protein sequences, which are highly informative for creating a custom sequence search space for MS/MS-based peptide identification. For more than 65% of the annotated proteins, more than one translation initiation site was determined [15].
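
How a protein sequence might be derived from a delineated ORF can be sketched as follows. This is only an illustration under stated assumptions: the translation initiation site is given as a 0-based index into the mRNA, the standard genetic code applies, and the start codon is decoded as methionine even when it is a near-cognate (non-AUG) codon. The function name `translate_from_tis` and the toy transcript are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_tis(mrna: str, tis_index: int) -> str:
    """Translate an mRNA from a translation initiation site to the first
    in-frame stop codon (or to the end of the sequence if none is found).

    Assumption of this sketch: the first codon is always decoded as Met,
    including near-cognate start codons such as CTG or GTG.
    """
    seq = mrna.upper().replace("U", "T")
    residues = []
    for pos in range(tis_index, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        aa = CODON_TABLE.get(codon, "X")  # X marks an ambiguous codon
        if aa == "*":
            break
        residues.append("M" if pos == tis_index else aa)
    return "".join(residues)


# Toy transcript: an upstream in-frame CTG precedes the annotated ATG.
toy_mrna = "GG" + "CTG" + "GCA" + "GCA" + "ATG" + "AAA" + "TTT" + "TAA" + "GG"

canonical = translate_from_tis(toy_mrna, 11)  # starts at the ATG
extended = translate_from_tis(toy_mrna, 2)    # starts at the upstream CTG
print(canonical, extended)
```

Here the upstream start yields an N-terminally extended form whose C-terminal part matches the canonical product, which is the kind of variant a reference database alone would not contain.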
