Abstract

Background: Mammography is the current diagnostic standard for breast cancer screening and monitoring. However, accessibility challenges, accuracy issues and patient discomfort all contribute to reduced patient compliance and utilization, resulting in a need for more effective diagnostic tools. An artificial intelligence (AI)-based lipidomic blood test may add significant value to early breast cancer detection and improve outcomes for patients. We have previously reported a series of lipidomic studies (n=793) and derived a lipid signature from plasma-enriched extracellular vesicles (EVs) that effectively distinguished people with localized breast cancer from cancer-free controls. Here we report the development of an AI model for breast cancer detection from lipidomic data measured directly in plasma samples.

Methods: Lipids in both EVs and plasma collected from fasted breast cancer and control blood samples (n=256) were extracted and analysed by liquid chromatography-high resolution mass spectrometry (LC-HRAM-MS). Over 400 manually curated lipids were quantified. A bootstrapped analysis using Boruta, a robust and statistically rigorous feature selection algorithm based on random forest feature importance, was employed to identify cancer-discriminatory lipid signatures in the EV and plasma lipidomes that were consistently selected across 2000 bootstrap samples. The resulting lipid signature was then used to train an ensemble of 18 distinct machine learning models for cancer status prediction, with a majority vote aggregating the individual predictions. Model performance and variability were assessed over 2000 iterations of leave-group-out cross-validation (LGOCV) using an 80/20 train-test split. Average patient-level predictions across LGOCV iterations were recorded for both the EV- and plasma-derived models, and the two modalities were compared using an exact paired samples test (McNemar’s test).
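The two aggregation and comparison steps named in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `majority_vote` and `mcnemar_exact` are hypothetical, the tie-breaking rule for an even number of models is an assumption, and the exact test is written as the standard two-sided binomial test on discordant pairs.

```python
from math import comb
from typing import Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return 1 (cancer) if more than half of the binary votes are 1, else 0.

    Ties (possible with an even number of models, such as 18) resolve to 0
    here; that tie rule is an assumption, not taken from the abstract.
    """
    return int(2 * sum(votes) > len(votes))


def mcnemar_exact(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Two-sided exact McNemar p-value for paired per-sample correctness."""
    # Discordant pairs: samples that exactly one of the two models gets right.
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    # Under the null hypothesis the discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Under this sketch, comparing sensitivity would mean passing only the cancer samples' correctness flags for the two models, specificity only the controls', and accuracy all samples.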
Results: Both the EV- and plasma-derived lipid signatures performed well in distinguishing breast cancer samples from controls. The bioinformatics AI pipeline produced a robust ensemble model that achieved an F1 score of 0.89 in plasma under LGOCV. The final plasma ensemble achieved an accuracy of 86.1% (±4.5%), a sensitivity of 91.4% (±5.4%), and a specificity of 78.7% (±8.6%), comparable to the EV ensemble (accuracy: 86.1±4.4%, sensitivity: 90.4±5.3%, specificity: 80.2±8.7%). Paired samples analysis using McNemar’s test indicated no significant differences between models trained on EV- and plasma-derived lipid signatures in sensitivity (p=0.65), specificity (p=0.49), or accuracy (p=0.42).

Conclusion: The initial study demonstrated the high performance of a plasma-enriched extracellular vesicle-derived lipid biomarker signature for early breast cancer detection. Direct assessment of the lipidomic signature from plasma showed promise in simplifying the test, offering advantages in scalability, throughput, and ease of implementation. Further verification of the lipid signature is planned in an upcoming study involving 500 plasma samples. Ongoing studies will further optimize the plasma lipidomic signature and strengthen our AI pipeline. These findings support the potential clinical application of AI-based lipidomic profiling as a blood-based screening tool for breast cancer detection.

Citation Format: Ameline Lim, Cheka Kehelpannala, Fatemeh Vafaee, Forrest Koch, Dana Pascovici, Desmond Li, Kerry Heffernan, Gillian Lamoury, Amani Batarseh, Bruce Mann. Development of an Artificial Intelligence-based breast cancer detection model using Plasma Lipidomic Signature [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO4-07-02.