Abstract

Background: Single-cell technologies provide invaluable insights into disease biology and drug development by revealing complex interactions among different cell types within patients. However, harnessing the potential of publicly available single-cell data remains challenging due to the lack of integrated data across diverse single-cell platforms.

Methods: To maximize the potential of single-cell insights, we have created an AI/ML-powered curation and data integration process within our Drug Intelligence Science (DIS®) platform. This automated process integrates single-cell transcriptomic data from publicly available sources with our in-house datasets generated from ex vivo translational efforts and clinical programs. Relevant public datasets are identified and retrieved using an automated Python-based Gene Expression Omnibus (GEO) dataset crawler, and standardized metadata are curated. A reference atlas of 31 immune and 4 non-immune cell types is generated from the database and used to label cells consistently across datasets. Additionally, cell-level gene expression is integrated and normalized across datasets using generative modeling techniques. To facilitate multiuser visualization and interrogation of the curated and integrated datasets, we have developed Cellect, a customized version of CellxGene Gateway.

Results: The current single-cell database comprises 15 million cells from >5,000 samples collected from >2,500 patients across 160 curated unique human studies, spanning oncology (56%), autoimmune disease (AID, 21%), and viral infection (15%), with the remainder from healthy reference tissues (8%). We focused dataset curation on indications of interest for our internal clinical development efforts. Uniquely, data from approximately 300 patients who received standard of care (SOC) treatment, including immune checkpoint inhibitor (IO) therapy, were integrated to delineate mechanisms of drug response and resistance.
The post-SOC resource was used to identify (1) treatment settings where these mechanisms are enriched and (2) markers co-expressed with targets of interest (e.g., PD1, CTLA4, OX40, TNFR2) in specific cell populations to inform combination strategies.

Conclusions: We present an AI/ML-guided approach to address the key challenge of integrating single-cell data across platforms and demonstrate that relevant disease biology is retained upon integration. We outline a path for deploying this solution at scale so that bench and computational scientists can use it to guide target and indication selection, as was done for our ongoing clinical programs, including our first-in-class TNFR2 agonist (HFB200301, NCT05238883) and second-generation OX40 agonist (HFB301001, NCT05229601).

Citation Format: Jack Russella-Pollard, Joshua Whitener, Jordan Byck, Monika Manne, Hani Alostaz, Marianna Elia, Kamil Krukowski, Gabrielle Wong, Francisco Adrian, Liang Schweizer, Robert H. Andtbacka, Christos Hatzis. Integrating public single-cell transcriptomics and patient profiles to guide clinical development [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 6202.
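
Illustrative sketch (not part of the published abstract): the Results mention identifying markers co-expressed with targets of interest in specific cell populations. The abstract does not say how this was computed. One simple way to frame the question, assuming cells already carry harmonized cell-type labels and normalized expression values, is to ask what fraction of target-positive cells in each cell type also express a candidate marker. The function name `coexpression_by_cell_type`, the input layout, the positivity threshold, and the toy values below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def coexpression_by_cell_type(cells, target, markers, threshold=0.0):
    """For each cell type, return the fraction of target-positive cells
    that also express each candidate marker above `threshold`.

    `cells` is a list of dicts with a "cell_type" label and an "expr"
    mapping of gene symbol -> expression value (hypothetical layout).
    """
    target_pos = defaultdict(int)                      # target+ cells per type
    both_pos = defaultdict(lambda: defaultdict(int))   # target+ marker+ cells
    for cell in cells:
        expr = cell["expr"]
        if expr.get(target, 0.0) <= threshold:
            continue  # only target-positive cells enter the denominator
        ctype = cell["cell_type"]
        target_pos[ctype] += 1
        for marker in markers:
            if expr.get(marker, 0.0) > threshold:
                both_pos[ctype][marker] += 1
    return {
        ctype: {marker: both_pos[ctype][marker] / n for marker in markers}
        for ctype, n in target_pos.items()
    }


# Made-up toy data, NOT values from the study. Gene symbols: TNFRSF1B
# encodes TNFR2, TNFRSF4 encodes OX40, PDCD1 encodes PD1.
toy_cells = [
    {"cell_type": "Treg", "expr": {"TNFRSF1B": 2.1, "TNFRSF4": 1.3}},
    {"cell_type": "Treg", "expr": {"TNFRSF1B": 1.7}},
    {"cell_type": "CD8 T", "expr": {"TNFRSF1B": 0.9, "PDCD1": 3.0}},
    {"cell_type": "CD8 T", "expr": {"PDCD1": 2.2}},
]

result = coexpression_by_cell_type(toy_cells, "TNFRSF1B", ["TNFRSF4", "PDCD1"])
for cell_type, fractions in result.items():
    print(cell_type, fractions)
```

Conditioning on target-positive cells within each cell type keeps abundant populations from dominating the estimate. A real analysis would also need to account for dropout, batch effects, and per-patient variability, none of which this sketch addresses.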