Abstract

Accurate and comprehensive interpretation of genomic variants has become a bottleneck in clinical sequencing applications, owing to the accelerating pace of precision oncology and the explosion of biomedical information. This motivated us to build Ephesus, a framework that enables curation of predictive, prognostic and diagnostic evidence for clinical biomarkers in cancers. Ephesus is currently the primary content source for Roche navify Mutation Profiler (nMP). The Ephesus data model ensures adherence to best practices in clinical genomic content curation. Variant classification follows the AMP guidelines for somatic variant interpretation. Variant interpretations are derived from international regulatory approvals and clinical practice guidelines (FDA, EMA, TGA, eviQ, etc.) and recommendations (NCCN, ESMO). The data model is mapped to a set of relational tables. To enforce data integrity and validity, we adopt strategies such as data normalization based on standard nomenclatures and ontologies, a submit-review-approval workflow, and biological constraints. Ephesus is deployed as a web application used by curators inside and outside Roche. The UI allows users to edit, filter, sort and perform bulk operations on entities by their attributes. To minimize manual effort and maximize content coverage, inference rules are applied before the content is ingested into nMP. Ephesus currently supports data collection, browsing and summarization for primary entities such as biomarkers, evidence items, genes, variants, variant groups, and drugs. The 2023-Aug content snapshot contains 11,596 directly curated biomarker profiles (including biomarker combinations) and more than 6 million expanded profiles for 41 major cancer types. To evaluate the clinical reporting value of the curated knowledge, ~160k real cancer patient samples from the AACR GENIE project were queried against Ephesus. Compared with three other major knowledgebases, Ephesus/nMP yielded the highest percentage of both patients and biomarkers with at least one interpretation.

Table 1.
Percentage of patients or biomarkers with at least one interpretation from each knowledgebase

| | CGI (2022-10-17) | CIViC (2023-09-01) | ClinVar (2023-09-09) | nMP (2023-08-11) | AACR-Genie-v14.0 Total |
| --- | --- | --- | --- | --- | --- |
| Patient % with any interpretation | 25% (40,545) | 46% (73,835) | 81% (130,882) | 89% (142,905) | 160,965 |
| Biomarker % (including combos) with any interpretation | 0.03% (256) | 0.06% (526) | 13.26% (122,424) | 73.79% (681,128) | 923,093 |

Citation Format: Jian Li, Lili Niu, Shuba Krishna, Fnu Kinshuk, Martin Jones, Carlos Hernandez, Michael Clark, Sandra Balladares. Genomics annotation and interpretation in somatic oncology using structured data from a clinical knowledgebase [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2351.