Abstract

SESSION TITLE: Use of Machine Learning and Artificial Intelligence

SESSION TYPE: Original Investigations

PRESENTED ON: 10/16/2022 10:30 am - 11:30 am

PURPOSE: To develop a machine learning (ML) model to predict the patient’s “lung age” (LA) as determined by their pulmonary function test (PFT).

METHODS: An extensive database of PFTs obtained from two Mayo Clinic sites over 20 years was used to derive and validate ML models to predict the patient’s LA. Patients were eligible for this study if they were 18 years of age or older, had appropriate research authorization, had no history of pulmonary disease or conditions, had no history of smoking or tobacco use, were not underweight or obese, and had normal PFTs as defined by ATS and GOLD criteria. Only each patient’s first PFT was considered. This resulted in 4,938 patients, who were randomly split into training (80%; N=3,964) and validation (20%; N=974) cohorts. Two gradient boosting machines (GBM) were trained and optimized using a grid search across hyperparameters (e.g., learning rate, interaction depth, number of trees, and sampling rate). The first model used traditional discrete data from spirometry, lung volume, and diffusion capacity. The second model also included time-series features extracted from continuous forced vital capacity expiratory spirometry curves. Final models were selected based on the lowest root mean square error (RMSE) averaged over 5-fold cross-validation. Model explainability was evaluated through variable importance and SHAP plots. Model performance was then compared to a previously published linear model for lung age that uses height, sex, and FEV1.

RESULTS: The combined training and validation cohorts had a median age of 57.9 (range: 18.0-94.2) and a median height of 1.7 meters (range: 1.4-2.0), and were 90.4% Caucasian, 7.3% African American, and 60.4% female. The GBM model trained on traditional discrete PFT values had an RMSE of 7.4 in the validation cohort. RMSE improved to 7.1 when features extracted from continuous flow-volume curves were included. Both models outperformed previously published models (RMSE = 16.2 in this validation cohort). Model performance was comparable across sex and race; however, all models performed better in patients 40 years of age or older.

CONCLUSIONS: Machine learning models could accurately predict lung age from PFT data with substantial performance improvements over currently published standards.

CLINICAL IMPLICATIONS: Lung age is more practical and readily understandable to patients, as demonstrated in a prior smoking cessation study. Additionally, because the lung age incorporates a composite of multiple variables from the PFT, rather than a single physiologic variable like the FEV1, it can be a more sensitive marker of disease for clinical trials and clinical practice.

DISCLOSURES:
No relevant relationships by Rodrigo Cartin-Ceba
No relevant relationships by Scott Helgeson
No relevant relationships by Patrick Johnson
No relevant relationships by Augustine Lee
No relevant relationships by Kaiser Lim
No relevant relationships by Alexander Niven
No relevant relationships by Victor Ortega
No relevant relationships by Daniel Poliszuk
No relevant relationships by Zachary Quicksall
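
The selection procedure described in the methods (an 80/20 training and validation split, a grid search scored by RMSE averaged over 5-fold cross-validation, then a final RMSE on the held-out cohort) can be illustrated with a minimal sketch. This is not the authors' code. To stay dependency-free it swaps the gradient boosting machine for a stand-in model, a one-variable least-squares line with a ridge penalty on the slope, and it runs on synthetic numbers. The function names, the penalty grid, the seeds, and the data generator are all assumptions made for illustration.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def split_indices(n, train_frac, seed):
    """Shuffle 0..n-1 and split into (training, validation) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


def k_folds(indices, k):
    """Partition indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys, penalty):
    """Stand-in model (NOT a GBM): least-squares line, ridge penalty on the slope."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + penalty)
    return slope, my - slope * mx


def predict(model, xs):
    """Apply a fitted (slope, intercept) pair to new inputs."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def cv_rmse(xs, ys, train_idx, penalty, k=5):
    """RMSE averaged over k folds drawn from the training indices only."""
    scores = []
    for fold in k_folds(train_idx, k):
        held = set(fold)
        fit_idx = [i for i in train_idx if i not in held]
        model = fit_line([xs[i] for i in fit_idx], [ys[i] for i in fit_idx], penalty)
        preds = predict(model, [xs[i] for i in fold])
        scores.append(rmse([ys[i] for i in fold], preds))
    return sum(scores) / len(scores)


def select_penalty(xs, ys, train_idx, grid, k=5):
    """Grid search: return the setting with the lowest mean CV RMSE, plus all scores."""
    scores = {p: cv_rmse(xs, ys, train_idx, p, k) for p in grid}
    return min(scores, key=scores.get), scores


def make_synthetic(n, seed):
    """Synthetic data only: x is a made-up FEV1-like value, y a made-up age."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.5, 5.0) for _ in range(n)]
    ys = [110.0 - 18.0 * x + rng.gauss(0.0, 7.0) for x in xs]
    return xs, ys


xs, ys = make_synthetic(500, seed=1)
train_idx, valid_idx = split_indices(len(xs), 0.8, seed=2)
grid = [0.0, 1.0, 10.0, 100.0]  # hypothetical hyperparameter grid
best, scores = select_penalty(xs, ys, train_idx, grid)

# Refit on the full training cohort, then score once on the held-out cohort.
final = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx], best)
valid_rmse = rmse([ys[i] for i in valid_idx], predict(final, [xs[i] for i in valid_idx]))
print(f"selected penalty={best}, validation RMSE={valid_rmse:.2f}")
```

The validation indices are never touched during the grid search, so the final RMSE is an out-of-sample estimate. That separation is what makes a comparison such as 7.4 versus 16.2 in the validation cohort meaningful.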
