Enhanced protein isoform characterization through long-read proteogenomics

Rachel M Miller, Robert J Millikin, Michael R Shortreed, Simi Kaur, Gloria M Sheynkman, Erin D Jeffery, Simone Tiberi, Lloyd M Smith, Yunxiang Dai, Anne Deslattes Mays, Ana Conesa, Chance John Luckey, Christina Chatzipantsiou, Madison M Mehlferber, Peter J Castaldi, Ben T Jordan

doi:10.1186/s13059-022-02624-y

Abstract

Background

The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms.

Results

We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis.

Conclusions

Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.

Journal: Genome Biology
Publication Date: Mar 3, 2022
Citations: 44
License type: open-access

Similar Papers

Long-read RNA sequencing analysis of the lytic human cytomegalovirus transcriptome
Zsolt Balázs
05 Sep 2019

Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease.
Peter J Castaldi ... Gloria M Sheynkman
Human molecular genetics | VOL. 31
12 Aug 2022

L-RAPiT: A Cloud-Based Computing Pipeline for the Analysis of Long-Read RNA Sequencing Data.
Theodore M Nelson ... Thomas S Postler
International journal of molecular sciences | VOL. 23
13 Dec 2022

Long-read RNA sequencing reveals widespread sex-specific alternative splicing in threespine stickleback fish.
Alice S Naftaly ... Shana Pau
Genome Research | VOL. 31
15 Jun 2021
