Abstract

Genomics-based neoantigen discovery can be enhanced by proteomic evidence, but there remains a lack of consensus on the performance of different quality control methods for variant peptide identification in proteogenomics. We propose to use the difference between accurately predicted and observed retention times for each peptide as a metric to evaluate different quality control methods. To this end, we develop AutoRT, a deep learning algorithm with high accuracy in retention time prediction. Analysis of three cancer data sets with a total of 287 tumor samples using different quality control strategies results in substantially different numbers of identified variant peptides and putative neoantigens. Our systematic evaluation, using the proposed retention time metric, provides insights and practical guidance on the selection of quality control strategies. We implement the recommended strategy in a computational workflow named NeoFlow to support proteogenomics-based neoantigen prioritization, enabling more sensitive discovery of putative neoantigens.
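The retention time metric described above can be illustrated with a minimal sketch. It assumes that each quality control strategy yields a list of (predicted, observed) retention time pairs for its accepted peptide identifications, and it summarizes each strategy by the median absolute difference. The function name `summarize_qc_strategies`, the strategy labels, the toy values, and the choice of the median as the summary statistic are all assumptions made for illustration; they are not taken from AutoRT or NeoFlow.

```python
from statistics import median


def summarize_qc_strategies(identifications):
    """Return the median |predicted - observed| retention time per strategy.

    `identifications` maps a (hypothetical) strategy name to a list of
    (predicted_rt, observed_rt) pairs, both in the same time unit.
    """
    summary = {}
    for strategy, pairs in identifications.items():
        abs_errors = [abs(predicted - observed) for predicted, observed in pairs]
        summary[strategy] = median(abs_errors)
    return summary


# Toy values only: these are not results from the paper.
toy_identifications = {
    "strategy_a": [(10.0, 10.5), (20.0, 19.0), (30.0, 30.2)],
    "strategy_b": [(10.0, 14.0), (20.0, 26.0), (30.0, 31.0)],
}

summary = summarize_qc_strategies(toy_identifications)

# A smaller median error suggests fewer questionable identifications.
best = min(summary, key=summary.get)
print(summary, best)
```

Under this reading, a strategy whose accepted peptides show smaller deviations between predicted and observed retention times is likely passing fewer false identifications, which is the intuition behind using the metric to compare strategies.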

Highlights

  • Genomics-based neoantigen discovery can be enhanced by proteomic evidence, but there remains a lack of consensus on the performance of different quality control methods for variant peptide identification in proteogenomics

  • This is achieved by simultaneously performing whole-exome sequencing (WES), RNA sequencing (RNA-Seq), and tandem mass spectrometry (MS/MS)-based shotgun proteomics analysis on matched samples, producing customized, sample-specific protein databases from DNA and/or RNA sequencing data, and searching

  • For the tandem mass tag (TMT) and isobaric tags for relative and absolute quantification (iTRAQ) studies, we built one customized database for each TMT or iTRAQ experiment based on WES data of all individual tumor samples in the TMT

Summary

Introduction

Proteogenomics has become a routine approach for the detection of protein sequences resulting from genomic aberrations such as single nucleotide variants (SNVs), insertions and deletions (INDELs), RNA editing, novel junctions, gene fusions, and novel transcription regions[1,2,3]. This is achieved by simultaneously performing whole-exome sequencing (WES), RNA sequencing (RNA-Seq), and tandem mass spectrometry (MS/MS)-based shotgun proteomics analysis on matched samples, producing customized, sample-specific protein databases from DNA and/or RNA sequencing data, and searching. Because MHC binds peptides rather than RNA molecules, validation of mutated alleles through proteomic profiling will likely provide more functionally and clinically relevant neoantigens for prioritization.
