LungFlag, a machine-learning (ML) personalized tool for assessing lung cancer risk in a community setting, to evaluate performance in flagging non-small cell lung cancer (NSCLC) regardless of sex or race.

David Morgenstern, Eran Netanel Choman

doi:10.1200/jco.2023.41.16_suppl.10570

Abstract

10570

Background: Application of ML-based risk prediction models to lung cancer screening cohorts has been shown to increase screening efficiency. However, ML-based models may be vulnerable to sex-based and racial bias arising from historical bias in health care access as well as from biased training data. Demonstrating fairness in the predictions of ML-based models is a prerequisite to their acceptance by clinicians and patients. We assessed the clinical performance of LungFlag by sex and race, two key demographic dimensions along which bias can produce disparate outcomes.

Methods: LungFlag is a retrospectively validated ML model intended to identify individuals who are at elevated risk for lung cancer and should be counseled regarding lung cancer screening. It uses existing routine outpatient lab measurements, smoking history, comorbidities, and demographic data to flag high-risk individuals. We compared the performance of LungFlag between sexes and between races, calculating sensitivity at an overall positivity rate of 3%. We chose a case-control design based on a large US-based community and outpatient dataset including 39,135 case patients with NSCLC and 212,454 contemporaneous NSCLC-free controls. We included ever-smokers aged 45-80 with available lab measurements from 3-12 months before diagnosis and a minimum follow-up of 24 months. Sub-populations representing less than 1% of the total population were excluded.

Results: Sensitivity and specificity for each sub-population are detailed in the table. No statistically significant difference in the model's sensitivity for flagging individuals diagnosed with NSCLC was observed across the sub-populations examined.

Conclusions: The LungFlag model demonstrated fairness with respect to sex and race, based on similar clinical sensitivity in a large, community-based retrospective dataset. Further assessment in prospective studies and in additional racial sub-populations is recommended to support this conclusion. [Table: see text]
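The subgroup comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the record layout `(score, is_case, group)`, and the toy data are all hypothetical. It also assumes that the positivity rate is computed over the pooled case and control sample, which the abstract does not specify. The toy example uses a 30% rate only so that ten records suffice; the study used 3%.

```python
def threshold_at_positivity(scores, positivity_rate):
    """Return the score cutoff that flags roughly `positivity_rate` of all scores.

    Assumption: the rate is taken over the pooled sample (cases and controls).
    Tied scores at the cutoff can push the flagged share slightly above the target.
    """
    ranked = sorted(scores, reverse=True)
    n_flagged = max(1, round(len(ranked) * positivity_rate))
    return ranked[n_flagged - 1]


def sensitivity_by_group(records, threshold):
    """Compute sensitivity (flagged cases / all cases) within each subgroup.

    `records` is an iterable of (score, is_case, group) tuples (hypothetical layout).
    Controls are ignored because sensitivity is defined on cases only.
    """
    cases = {}
    flagged = {}
    for score, is_case, group in records:
        if not is_case:
            continue
        cases[group] = cases.get(group, 0) + 1
        if score >= threshold:
            flagged[group] = flagged.get(group, 0) + 1
    return {group: flagged.get(group, 0) / n for group, n in cases.items()}


# Toy data for illustration only; these are not study results.
demo_records = [
    (0.90, True, "female"),
    (0.80, True, "male"),
    (0.70, False, "male"),
    (0.60, True, "female"),
    (0.50, False, "female"),
    (0.40, True, "male"),
    (0.30, False, "male"),
    (0.20, False, "female"),
    (0.10, False, "male"),
    (0.05, False, "female"),
]

cutoff = threshold_at_positivity([r[0] for r in demo_records], positivity_rate=0.30)
sens = sensitivity_by_group(demo_records, cutoff)
print(cutoff, sens)
```

Using a single pooled cutoff, rather than one cutoff per subgroup, means that any gap in sensitivity reflects how the model scores each subgroup and not a per-group recalibration. The abstract does not name the statistical test used to compare the subgroup sensitivities, so that step is omitted here.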
