Abstract
Effective identification of detailed biochemical features at the protein level has the potential to boost genome-based Virology and Vaccinology by enhancing the detection of viruses’ weak spots. Precise mapping of such sensitive sites, known as the epitope identification problem, constitutes a grand computational challenge: solving it may lead to improved vaccine efficacy against existing as well as future viral strains. We consider the open-ended challenge of discovering unmapped epitopes. We introduce a spatial-linguistic approach, in which the relevant protein sequences are treated as alphabetical strings linked to a known 3D structure. The explicit aim is to correctly parse them into tokens/subsets that are meaningful with respect to the chemical space and can be associated with the virus’s epitopes. The problem is thereby translated into seeking an effective tokenizer/word-splitter that adheres to prescribed biochemical criteria, satisfies spatial constraints, and outputs subsequences that accurately map the epitopes. We devise three competing procedures: two chemistry-driven, parameterized heuristics and a constrained Subset Selection problem-solver. The heuristics require tuning, which we conduct by Evolution Strategies, whereas the Subset Selection problem is solved by a Genetic Algorithm. Empirical results are presented herein for detecting epitopes of influenza A(H3N2), exhibiting an improvement over the baseline reference when applied to experimental Hemagglutination Inhibition data. Finally, we discuss the potential of the proposed approach with respect to the recent SARS-CoV-2 coronavirus, in the effort to further develop and improve effective vaccines and fight the COVID-19 pandemic.
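To make the tokenizer formulation concrete, the following is a minimal sketch of how a sequence with linked 3D coordinates might be split into spatially coherent tokens. It is not one of the procedures described above: the residue set `POLAR`, the distance cutoff `max_dist`, and the function name `spatial_tokens` are all illustrative assumptions, and grouping by connected components is just one simple way to combine a biochemical criterion with a spatial constraint.

```python
from math import dist

# Hypothetical biochemical criterion: residues treated as eligible for a token.
# This set is an illustrative assumption, not the criterion used in the paper.
POLAR = set("DEKRNQSTHY")

def spatial_tokens(sequence, coords, eligible=POLAR, max_dist=8.0):
    """Group eligible residues into tokens.

    Two eligible residues are linked when their 3D coordinates lie within
    max_dist of each other; each token is a connected component of that
    graph, returned as a sorted list of sequence indices.
    """
    idx = [i for i, aa in enumerate(sequence) if aa in eligible]
    parent = {i: i for i in idx}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(idx):
        for b in idx[pos + 1:]:
            if dist(coords[a], coords[b]) <= max_dist:
                parent[find(a)] = find(b)

    groups = {}
    for i in idx:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy input: five residues placed along a line (coordinates are made up).
toy_coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (50, 0, 0), (100, 0, 0)]
print(spatial_tokens("KAEGD", toy_coords))  # [[0, 2], [4]]
```

In this toy input, K and E are both eligible and close in space, so they form one token even though they are not adjacent in the string, while the distant D forms a token of its own. In a parameterized heuristic of this kind, the cutoff and the eligibility set would be the quantities subject to tuning.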