Large Database Systems Research Articles

Abstract

Purpose of Study: In this study we demonstrate accurate prediction of the impact of somatic mutations on the HLA presentation landscape, achieved by interrogating a large-scale database of 1.4 million unique HLA peptide sequences that have been directly identified by mass spectrometry.

Background: Peptides presented to the immune system on HLA complexes are valuable targets for immunotherapeutic treatments. Identifying the full complement of peptides derived from a particular protein that are presented on major class I HLA restrictions will provide a vital step toward increasing the speed and viability of many immunotherapeutic strategies. Advances in next-generation sequencing (NGS) and single-cell technologies have enabled the accurate capture of somatic mutations accumulated by a tumor, yet a significant hurdle remains in how this information can be used for immunotherapeutic benefit. In particular, identifying which somatic mutations produce neoantigens (peptides that contain a somatic mutation and are presented to the immune system in complex with HLA) is crucial to linking genetic changes with immunologic impact.

Materials and Methods: Our approach to understanding the targetable human HLA peptidome is based on three key principles: achieving full proteome coverage, maximising individual protein coverage, and focusing on dominant HLA restrictions. By integrating novel cell biology, mass spectrometry, and bioinformatic technologies across over 1,000 individual experiments, we have dramatically increased the depth of the HLA ligandome captured and achieved near-total coverage of the protein-coding genome. Over 90% of the proteome has been captured for the restriction HLA-A*02:01, dominant in Caucasian populations. Our comprehensive genome coverage has enabled us to probe both directly and indirectly for the presence of neoantigens. Known somatic mutations within immortalized lines were used to generate bespoke reference databases, which has led to the direct identification of many hundreds of neoantigens.

Results: Proteins that were found to contain neoantigens appeared to follow the same pattern of antigen processing and presentation as their unmutated equivalents. We have therefore found that our HLA peptide dataset offers significant value in predicting the likelihood of a somatic mutation creating a neoantigen. To test this, somatic mutations reported in 980 cell lines were probed against the database of HLA peptides. On average we find one peptide containing the mutated amino acid for every five somatic mutations reported. By incorporating the HLA background of the cell carrying the mutation, we narrow this prediction to one high-affinity HLA peptide for every fourteen somatic mutations reported. Comparing the peptides predicted in this analysis with those directly identified by mass spectrometry, we are able to show that we can prioritize mutation data by accurately predicting the presence and relative abundance of neoantigens. Our neoantigen prediction process is fully incorporated into a large-scale database system, enabling us to seamlessly integrate NGS data from individual tissue and use peptidomic data to rapidly define the targetable landscape of an individual.

Conclusions: An integrative approach to HLA peptidomics has delivered a powerful reference database for developing novel immunotherapies.

Citation Format: Alex S. Powlesland, Geert P.M. Mommen, Ricardo J. Carreira, Jacob Hurst, Michael J. Cundell, David Lowne, Floriana Capuano, Bent K. Jakobsen. Exploiting large-scale HLA peptidomics to generate novel immunotherapies: A data-driven approach to true neoantigen prioritization [abstract]. In: Proceedings of the Fourth CRI-CIMT-EATI-AACR International Cancer Immunotherapy Conference: Translating Science into Survival; Sept 30-Oct 3, 2018; New York, NY. Philadelphia (PA): AACR; Cancer Immunol Res 2019;7(2 Suppl):Abstract nr B086.
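
The step in the abstract above of probing reported somatic mutations against the HLA peptide database can be pictured as an interval lookup. The abstract does not describe the authors' actual implementation, so the following is only a minimal sketch under stated assumptions: the record layouts (Peptide, Mutation), the 1-based positions, and the helper names are all hypothetical, and matching on the cell's HLA alleles stands in loosely for the "HLA background" filter without modeling binding affinity.

```python
from collections import defaultdict
from typing import NamedTuple


class Peptide(NamedTuple):
    """Hypothetical record for one observed HLA peptide."""
    protein: str   # identifier of the source protein
    start: int     # 1-based position of the first residue in the protein
    sequence: str  # peptide sequence as observed
    hla: str       # restriction the peptide was observed on


class Mutation(NamedTuple):
    """Hypothetical record for one reported somatic mutation."""
    protein: str
    position: int        # 1-based position of the substituted residue
    cell_hla: frozenset  # HLA alleles carried by the cell line


def index_by_protein(peptides):
    """Group peptides by source protein so each lookup scans one protein only."""
    index = defaultdict(list)
    for peptide in peptides:
        index[peptide.protein].append(peptide)
    return index


def covering_peptides(mutation, index, match_hla=False):
    """Return database peptides whose span includes the mutated position.

    With match_hla=True, keep only peptides observed on an allele that the
    mutated cell line carries (an assumed proxy for the HLA background).
    """
    hits = []
    for peptide in index.get(mutation.protein, []):
        end = peptide.start + len(peptide.sequence) - 1
        if not (peptide.start <= mutation.position <= end):
            continue
        if match_hla and peptide.hla not in mutation.cell_hla:
            continue
        hits.append(peptide)
    return hits


def covered_fraction(mutations, index, match_hla=False):
    """Fraction of mutations that fall inside at least one database peptide."""
    if not mutations:
        return 0.0
    covered = sum(1 for m in mutations if covering_peptides(m, index, match_hla))
    return covered / len(mutations)
```

The covered fraction computed here is related to, but not the same statistic as, the "one peptide for every five somatic mutations" figure quoted in the abstract, which counts peptides rather than covered mutations. Grouping peptides by protein first keeps each lookup proportional to the number of peptides observed for that protein rather than to the size of the whole database.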

We developed an attribute-shuffling obfuscation for database applications in cloud environments and studied its potential for preventing information leaks to malicious administrators at cloud providers, who have unlimited and uncensored access to any local resources, possibly including the system security logs. The proposed obfuscation allows database management systems at cloud servers to perform fundamental query operations while cloud users' information is protected against leaks to malicious administrators. We studied the performance of the proposed obfuscation and found that the inflation of obfuscated tables at cloud servers and the increase in network traffic load are the major overheads. We developed algorithms that mitigate the inflation of obfuscated tables using an 'α' parameter and the increase in network traffic load using the query constructor. The former achieved a linear relation between the obfuscated table size and the degree of obfuscation, that is, how hard it is for malicious administrators to understand the meaning of users' information. With the latter, the network traffic load nearly converged to that of a system with no obfuscation for busy systems. We conclude that the proposed attribute-shuffling obfuscation is feasible and efficient for busy and large database systems, and that it adapts to database systems with diverse configurations. Copyright © 2015 John Wiley & Sons, Ltd.

Articles published on Large Database Systems

The design of library database management system based on MySQL

A large scale training sample database system for intelligent interpretation of remote sensing imagery

Artificial intelligence dissociative identity disorder (AIDIS): the dark side of ChatGPT

Analysis on NSAW Reminder Based on Big Data Technology

A Comprehensive Study on Code Coverage Analysis for effective Test Development/Enhancement Methodology

Application of Big Data Information Platform in Medical Equipment

Abstract B086: Exploiting large-scale HLA peptidomics to generate novel immunotherapies: A data-driven approach to true neoantigen prioritization

Small Quadrotor Plant Protection UAV System Based on Big Data

Prospective registry database of patients with malignant mesothelioma: directions for a future Japanese registry-based lung cancer study.

A Generic Scheme of plaintext-checkable database encryption

Designs, analyses, and optimizations for attribute-shuffling obfuscation to protect information from malicious cloud administrators

Multimedia Mining Research – An Overview

A Vehicle Monitoring System Based on STeCEQL

Managing and analysing brain data through use of digital atlasing tools and infrastructures

Towards Big Data to Improve Availability of Massive Database

A Range Query Method using Index in Large-scale Database Systems

Content-based Image Retrieval by Information Theoretic Measure

Tree and Hashing Data Structures to Speed up Chemical Searches: Analysis and Experiments.

Accusation of Salami publication: the new bane of large database investigations? Young investigators beware!

Content similarity matching for video sequence identification
