Abstract

Rationale: Chronic obstructive pulmonary disease (COPD) is a heterogeneous syndrome whose phenotypic manifestations tend to be distributed along a continuum. Unsupervised machine learning based on a broad selection of imaging and clinical phenotypes may be used to identify primary variables that define disease axes and stratify patients with COPD.

Objectives: To identify the primary variables driving COPD heterogeneity using principal component analysis, to define disease axes, and to assess the prognostic value of these axes across three outcomes: progression, exacerbation, and mortality.

Methods: We analyzed 7,331 patients between 39 and 85 years old from the COPDGene (Genetic Epidemiology of COPD) phase I cohort (2008-2011), of whom 40.3% were Black and 45.8% were female smokers with a mean of 44.6 pack-years. From a total of 916 phenotypes, we selected 147 continuous clinical, spirometric, and computed tomography (CT) features. For each principal component (PC), we computed a PC score based on feature weights. We used the PC score distributions to define disease axes, along which we divided the patients into quartiles. To assess the prognostic value of these axes, we applied logistic regression analyses to estimate 5-year (n = 4,159) and 10-year (n = 1,487) odds of progression. Cox regression and Kaplan-Meier analyses were performed to estimate 5-year and 10-year risk of exacerbation (n = 6,532) and all-cause mortality (n = 7,331).

Results: The first PC, accounting for 43.7% of variance, was defined by CT measures of air trapping and emphysema. The second PC, accounting for 13.7% of variance, was defined by spirometric and CT measures of vital capacity and lung volume. The third PC, accounting for 7.9% of variance, was defined by CT measures of lung mass, airway thickening, and body habitus. Stratification of patients across each disease axis revealed up to 3.2-fold (95% confidence interval [CI] 2.4, 4.3) greater odds of 5-year progression, 5.4-fold (95% CI 4.6, 6.3) greater risk of 5-year exacerbation, and 5.0-fold (95% CI 4.2, 6.0) greater risk of 10-year mortality between the highest and lowest quartiles.

Conclusions: Unsupervised learning analysis of the COPDGene cohort reveals that CT measurements may bolster patient stratification along the continuum of COPD phenotypes. Each of the disease axes also individually demonstrates prognostic potential, predicting future decline in forced expiratory volume in 1 second, exacerbation, and mortality.
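
To make the Methods concrete, the following is a minimal sketch of how a PC score and a quartile comparison might be computed. It is not the authors' pipeline. It assumes the feature weights (loadings) are already available from a PCA, uses invented weights and synthetic data, and replaces the study's logistic regression with a simpler unadjusted odds ratio from a 2x2 table with a Woolf 95% CI. All function names (`pc_scores`, `quartile_labels`, `odds_ratio`, `demo`) are hypothetical.

```python
import math
import random
import statistics


def standardize(column):
    """Z-score one feature column (mean 0, sample SD 1)."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def pc_scores(rows, weights):
    """PC score per patient: weighted sum of standardized features.

    `weights` stands in for the loadings of one principal component;
    here they are supplied by the caller, not estimated.
    """
    columns = [standardize(list(col)) for col in zip(*rows)]
    return [sum(w * z for w, z in zip(weights, patient))
            for patient in zip(*columns)]


def quartile_labels(scores):
    """Label each score 1 (lowest quartile) to 4 (highest quartile)."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


def odds_ratio(exposed_events, exposed_total, ref_events, ref_total):
    """Unadjusted odds ratio with a Woolf (log-scale) 95% CI.

    Assumes no cell of the 2x2 table is zero.
    """
    a, b = exposed_events, exposed_total - exposed_events
    c, d = ref_events, ref_total - ref_events
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


def demo(seed=0, n=400):
    """Synthetic example: highest versus lowest quartile of one axis."""
    rng = random.Random(seed)
    severity = [rng.gauss(0, 1) for _ in range(n)]  # latent, invented
    # Three invented features that loosely track the latent severity.
    rows = [(s + rng.gauss(0, 0.5),
             s + rng.gauss(0, 0.5),
             -s + rng.gauss(0, 0.5)) for s in severity]
    weights = (0.6, 0.6, -0.5)  # hypothetical loadings, not from the study
    quartiles = quartile_labels(pc_scores(rows, weights))
    # Invented binary outcome that becomes likelier with severity.
    events = [s + rng.gauss(0, 1.5) > 0.5 for s in severity]
    top = [e for e, q in zip(events, quartiles) if q == 4]
    bottom = [e for e, q in zip(events, quartiles) if q == 1]
    return odds_ratio(sum(top), len(top), sum(bottom), len(bottom))


if __name__ == "__main__":
    ratio, lower, upper = demo()
    print(f"Q4 vs Q1 odds ratio: {ratio:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

In a real analysis the loadings would come from fitting PCA to the standardized feature matrix, and the odds and hazard ratios would come from regression models that can adjust for covariates. The sketch only illustrates the score, quartile, and highest-versus-lowest comparison steps.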
