Proteomics in non-human primates: utilizing RNA-Seq data to improve protein identification by mass spectrometry in vervet monkeys

J Michael Proffitt, Laura A Cox, Kylie Kavanagh, Michael R Shortreed, Jeremy Glenn, Anthony J Cesnik, Michael Olivier, Lloyd M Smith, Avinash Jadhav

doi:10.1186/s12864-017-4279-0

Abstract

Background: Shotgun proteomics uses a database search strategy to compare detected mass spectra to a library of theoretical spectra derived from reference genome information. As such, the robustness of proteomics results depends on the completeness and accuracy of the gene annotation in the reference genome. For animal models of disease where genomic annotation is incomplete, such as non-human primates, proteogenomic methods can improve protein detection by incorporating transcriptional data from RNA-Seq into the search databases used for peptide spectral matching. Customized search databases derived from RNA-Seq data can identify unannotated genetic and splice variants while limiting the number of comparisons to those transcripts actively expressed in the tissue.

Results: We collected RNA-Seq and proteomic data from 10 vervet monkey liver samples and used the RNA-Seq data to curate sample-specific search databases, which were searched with the program Morpheus. We compared these results against those from a search database generated from the reference vervet genome. A total of 284 previously unannotated splice junctions were predicted by the RNA-Seq data, 92 of which were confirmed by peptide spectral matches. More than half (53/92) of these unannotated splice variants had orthologs in other non-human primates, suggesting that the failure to match these peptides in the reference analyses likely arose from incomplete gene model information. The sample-specific databases also identified 101 unique peptides containing single amino acid substitutions that were missed by the reference database. Because the sample-specific searches were restricted to actively expressed transcripts, the search databases were smaller, more computationally efficient, and identified more peptides at the empirically derived 1% false discovery rate.

Conclusion: Proteogenomic approaches are ideally suited to facilitate the discovery and annotation of proteins in less widely studied animal models such as non-human primates. We expect that these approaches will help to improve existing genome annotations of non-human primate species such as vervet.
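The abstract reports identifications at an empirically derived 1% false discovery rate but does not say how that rate was estimated. The following is a minimal sketch, assuming a target-decoy strategy (a common choice in shotgun proteomics, not confirmed by this text): each peptide spectral match carries a score and a flag marking whether it matched a decoy sequence, and the FDR at a given rank is estimated as decoys divided by targets. The function name, the input layout, and the toy scores are illustrative assumptions, not the authors' pipeline.

```python
from typing import List, Tuple


def accept_psms_at_fdr(psms: List[Tuple[float, bool]],
                       max_fdr: float = 0.01) -> List[float]:
    """Return the scores of target PSMs accepted at the requested FDR.

    psms: (score, is_decoy) pairs; a higher score means a better match.
    The FDR at each rank is estimated as decoys / targets (an assumption).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for rank, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = rank
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]


# Toy data, invented for illustration only.
toy = [(9.0, False), (8.0, False), (7.0, True), (6.0, False)]
print(accept_psms_at_fdr(toy, max_fdr=0.0))
print(accept_psms_at_fdr(toy, max_fdr=0.5))
```

Under this kind of estimate, a smaller search database tends to produce fewer high-scoring random matches, which is one plausible reason a sample-specific database could pass more peptides at the same 1% threshold.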

Highlights

Shotgun proteomics utilizes a database search strategy to compare detected mass spectra to a library of theoretical spectra derived from reference genome information.
Search databases curated from RNA-Seq data are smaller and computationally more efficient than reference genome databases.
To demonstrate the utility of RNA-Seq derived proteomics search databases, we created sample-specific databases (SSdb) for each liver sample from 10 different vervet monkeys, based on sequenced messenger ribonucleic acid (mRNA) extracted from the same tissue sample as the protein being analyzed by mass spectrometry (MS).
We found no relationship between RNA-Seq read depth and the number of peptide spectral matches (PSMs) or unique peptides identified in the samples when searched with the SSdb.
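The idea of a sample-specific database restricted to expressed transcripts can be sketched as a simple filter. This is a hedged illustration only: the transcript identifiers, the expression unit, and the `min_expression` cutoff are hypothetical, and the text does not state what criterion the authors used to call a transcript expressed.

```python
from typing import Dict


def build_sample_specific_db(proteins: Dict[str, str],
                             expression: Dict[str, float],
                             min_expression: float = 1.0) -> Dict[str, str]:
    """Keep only protein sequences whose transcript is expressed in the sample.

    proteins: transcript ID -> translated protein sequence.
    expression: transcript ID -> expression estimate from RNA-Seq.
    min_expression: hypothetical cutoff; transcripts missing from
    `expression` are treated as not expressed.
    """
    return {
        tid: seq
        for tid, seq in proteins.items()
        if expression.get(tid, 0.0) >= min_expression
    }


def to_fasta(entries: Dict[str, str], width: int = 60) -> str:
    """Serialize entries as FASTA text, wrapping sequences at `width` residues."""
    lines = []
    for tid in sorted(entries):
        lines.append(f">{tid}")
        seq = entries[tid]
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy input, invented for illustration only.
proteins = {"tx1": "MKTAYR", "tx2": "MSEQK", "tx3": "MAAA"}
expression = {"tx1": 12.5, "tx2": 0.2}
ssdb = build_sample_specific_db(proteins, expression)
print(to_fasta(ssdb), end="")
```

In this toy case only `tx1` passes the cutoff, so the resulting database is smaller than the full set of sequences, which mirrors the size reduction described in the highlights.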

Summary

Introduction

Shotgun proteomic approaches employ a database search strategy to compare experimentally observed mass spectra to an in silico-generated library of theoretical spectra derived from gene annotation information of the organism(s) being studied. Proteomic studies of genetically well-characterized species such as mice and humans benefit from robust proteomic search databases and extensive genome annotations, which can account for known genetic variability such as splice variants and sequence variation altering the amino acid sequence of encoded proteins. Protein identification in other research model organisms is limited by the quality of their reference genome annotations.
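To make the in silico library concrete, here is a minimal sketch of how candidate peptides and their theoretical masses might be derived from a protein sequence. It assumes trypsin digestion (cleavage after K or R, but not before P) and unmodified peptides; the text does not specify the enzyme or the search settings, so these are illustrative assumptions. The residue masses are standard monoisotopic values.

```python
from typing import List

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(protein: str, min_len: int = 6) -> List[str]:
    """Split a protein after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return [p for p in peptides if len(p) >= min_len]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


# Toy sequence, invented for illustration only.
for pep in tryptic_digest("MKTAYIAKQRPLSEK", min_len=1):
    print(pep, round(peptide_mass(pep), 4))
```

A single amino acid substitution shifts the peptide mass (for example, replacing I with V lowers it by about 14.0157 Da), so a peptide carrying a variant absent from the reference database has no matching theoretical entry. This is the gap that databases built from the sample's own RNA-Seq data are meant to close.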

Journal: BMC Genomics
Publication Date: Nov 13, 2017
Citations: 17
License type: open-access
