Abstract

Simple Summary

Peptides expressed on the cell surface can be used to distinguish between diseased and healthy cells and for precision drug targeting. Ideal targets in cancer diagnostics and therapeutic development are altered peptide sequences that reach the cell surface as expressed neo-antigens. Identifying these peptides requires both genomic and proteomic sequencing technologies, which makes the process both expensive and challenging. We present an alternative solution in which cloud computing is used to improve and simplify current approaches.

Unique peptide neo-antigens presented on the cell surface are attractive targets for researchers in nearly all areas of personalized medicine. Cells presenting peptides with mutated or other non-canonical sequences can be utilized for both targeted therapies and diagnostics. Today’s state-of-the-art pipelines utilize complementary proteogenomic approaches in which RNA or ribosomal sequencing data are used to create libraries against which tandem mass spectrometry data can be compared. In this study, we present an alternative approach whereby cloud computing is utilized to power neo-antigen searches against community curated databases containing more than 7 million human sequence variants. Using these expansive databases of high-quality sequences as a reference, we reanalyze the original data from two previously reported studies to identify neo-antigen targets in metastatic melanoma. Using our approach, we identify 79 percent of the non-canonical peptides reported by previous genomic analyses of these files. Furthermore, we report 18-fold more non-canonical peptides than previously reported. The novel neo-antigens we report herein can be corroborated by secondary analyses, such as high predicted binding affinity from well-established tools such as NetMHC. Finally, we report 738 non-canonical peptides shared by at least five patient samples, and 3258 shared across the two studies. This illustrates the depth of data that is present but typically missed by proteogenomic approaches with lower statistical power. The large list of peptides shared across the two studies, together with their annotations, non-canonical origins, and MS/MS spectra, is made available on a web portal for community analysis.

Highlights

  • To address the challenges of false discovery rate estimation, we describe an alternative approach that uses a peptide-shuffling decoy database, which does not suffer from the score over-estimation problems demonstrated in the standard reverse protein strategies used in proteomics.

  • We demonstrate the efficacy of this approach by reanalyzing two publicly available melanoma human leukocyte antigen (HLA) datasets and comparing the results with those of commonly used proteomics algorithms: MaxQuant, Sequest, Comet, and MS-GF+ [33,34,35,36].

  • We present the largest collection of high-quality HLA peptide sequences to date and a proof of concept for the use of cloud computing resources and community curated sequence libraries to enable immunopeptidomics.
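
The peptide-shuffling decoy idea in the first highlight can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the retry limit, the fixed random seed, and the simple decoy-to-target ratio used as the false discovery rate (FDR) estimate are all assumptions made for illustration. Each decoy keeps the amino acid composition of its target peptide but has a shuffled residue order, and shuffles that coincide with a real target sequence are rejected.

```python
import random


def shuffled_decoy(peptide, targets, rng, max_tries=20):
    """Return a shuffled copy of `peptide` that is not a target sequence.

    Returns None if no such shuffle is found (for example, a homopolymer
    such as "AAAA" can only ever shuffle back into itself).
    """
    residues = list(peptide)
    for _ in range(max_tries):
        rng.shuffle(residues)
        candidate = "".join(residues)
        if candidate not in targets:
            return candidate
    return None


def build_decoys(targets, seed=0):
    """Build one shuffled decoy per target peptide where possible."""
    rng = random.Random(seed)  # fixed seed so the decoy set is reproducible
    target_set = set(targets)
    decoys = []
    for peptide in targets:
        decoy = shuffled_decoy(peptide, target_set, rng)
        if decoy is not None:
            decoys.append(decoy)
    return decoys


def fdr_at_threshold(matches, threshold):
    """Estimate FDR as decoy hits / target hits at or above `threshold`.

    `matches` is a list of (score, is_decoy) pairs; the scoring function
    itself is outside the scope of this sketch.
    """
    n_target = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    n_decoy = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return n_decoy / n_target if n_target else 0.0


if __name__ == "__main__":
    example_targets = ["SIINFEKL", "GILGFVFTL", "AAAA"]  # illustrative peptides only
    print(build_decoys(example_targets))
    example_matches = [(10, False), (9, False), (8, True), (7, False), (3, True)]
    print(fdr_at_threshold(example_matches, threshold=7))
```

Because each decoy is generated at the peptide level, the decoy set has the same length and composition distribution as the target set. That is the property a shuffled decoy database relies on, in contrast to reversing whole protein sequences.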


Introduction

One of the most promising cancer immunotherapy options targets molecular entities that are expressed by tumor cells but absent from normal cells [1,2,3,4]. The most common such entities are short peptides presented on the cell surface bound to human leukocyte antigen (HLA) molecules. Mutated neo-antigens, when expressed and presented on the cell surface, are attractive targets for immune checkpoint blockade therapies as well as clinical diagnostics [2]. It is well established that loss of HLA heterozygosity (LOH) is a common occurrence in metastasis.

