An explainable artificial intelligence framework for risk prediction of COPD in smokers
Background: Because the early signs of chronic obstructive pulmonary disease (COPD) are inconspicuous, affected individuals often go unidentified, and opportunities for timely prevention and treatment are missed. The purpose of this study was to create an explainable artificial intelligence framework combining data preprocessing methods, machine learning methods, and model interpretability methods to identify people at high risk of COPD in the smoking population and to provide a reasonable interpretation of model predictions. Methods: The data comprised questionnaire information, physical examination data, and results of pulmonary function tests before and after bronchodilatation. First, factor analysis of mixed data (FAMD), Boruta, and NRSBoundary-SMOTE resampling were used to address the problems of missing data, high dimensionality, and class imbalance. Then, seven classification models (CatBoost, NGBoost, XGBoost, LightGBM, random forest, SVM, and logistic regression) were applied to model the risk level, and the decisions of the best machine learning (ML) model were explained using the Shapley additive explanations (SHAP) method and partial dependence plots (PDP). Results: In the smoking population, age and 14 other variables were significant factors for predicting COPD. The CatBoost, random forest, and logistic regression models performed reasonably well on unbalanced datasets. CatBoost with NRSBoundary-SMOTE had the best classification performance on balanced datasets when composite indicators (the AUC, F1-score, and G-mean) were used as model comparison criteria. Age, COPD Assessment Test (CAT) score, gross annual income, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), anhelation, respiratory disease, central obesity, use of polluting fuel for household heating, region, use of polluting fuel for household cooking, and wheezing were important factors for predicting COPD in the smoking population. Conclusion: This study combined feature screening methods, unbalanced data processing methods, and advanced machine learning methods to enable early identification of COPD risk groups in the smoking population. COPD risk factors in the smoking population were identified using SHAP and PDP, with the goal of providing theoretical support for targeted screening strategies and self-management strategies for the smoking population.
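The abstract compares models with composite indicators because plain accuracy can be misleading when one class is rare. The sketch below illustrates two of those indicators, assuming the usual definitions (F1 as the harmonic mean of precision and sensitivity, G-mean as the square root of sensitivity times specificity). The function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def composite_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return F1-score and G-mean from binary confusion-matrix counts."""
    # Sensitivity: share of true COPD cases that the model flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of non-cases that the model correctly clears.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    return {"f1": f1, "g_mean": g_mean}


# Hypothetical cohort: 100 cases and 900 non-cases.
# A model that never predicts COPD is 90% accurate but scores zero here.
always_negative = composite_metrics(tp=0, fp=0, tn=900, fn=100)
# A model that finds most cases at the cost of some false alarms.
balanced_model = composite_metrics(tp=80, fp=90, tn=810, fn=20)
print(always_negative, balanced_model)
```

Both indicators fall to zero for the always-negative model, which is why they suit comparisons on imbalanced screening data better than accuracy alone.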
- # Chronic Obstructive Pulmonary Disease
- # Shapley Additive Explanations
- # Partial Dependence Plot
- # Smoking Population
- # High Risk Of Chronic Obstructive Pulmonary Disease
- # Chronic Obstructive Pulmonary Disease Assessment Test
- # Chronic Obstructive Pulmonary Disease Risk Factors
- # Chronic Obstructive Pulmonary Disease In Smokers
- # Chronic Obstructive Pulmonary Disease Risk
- # Results Of Pulmonary Function Tests
- Research Article
13
- 10.1016/j.cyto.2019.154881
- Oct 16, 2019
- Cytokine
Increased levels of inflammatory biomarker CX3CL1 in patients with chronic obstructive pulmonary disease
- Research Article
1
- 10.4103/jpbs.jpbs_302_19
- Nov 1, 2020
- Journal of Pharmacy & Bioallied Sciences
ABSTRACT Introduction: Chronic obstructive pulmonary disease (COPD) is a chronic airflow disorder accompanied by declining health status. The COPD Assessment Test (CAT) is commonly used to assess patients' health status and treatment results. The aim of this study was to assess therapeutic outcomes in patients with COPD using the CAT in private hospitals in Yogyakarta. Materials and Methods: This was a cross-sectional study involving 156 patients aged >40 years who had completed the CAT questionnaire. The CAT consists of eight items: cough, phlegm, chest tightness, breathlessness going up hills/stairs, activity limitations at home, confidence leaving home, sleep, and energy. CAT scores were categorized into four groups: successful therapy (CAT score <10), moderately successful (CAT score 10–19), less successful (CAT score 20–30), and unsuccessful (CAT score >30). The study was conducted from April to August 2018 at two private hospitals in Yogyakarta, followed by descriptive-analytical data processing and chi-square analysis. Results: The therapeutic outcomes of COPD were 30.13% successful (CAT score <10), 60.26% moderately successful (CAT score 10–19), and 9.62% less successful (CAT score 20–30); no patients had unsuccessful therapy. The majority of patients had moderate airflow severity. Exacerbation status, severity level, and type of therapy were significantly associated with therapeutic outcome (P < 0.05), and among the eight CAT items, breathlessness going up hills/stairs was reported by 37.8% of respondents. Conclusion: The CAT can assess the therapeutic outcomes and health status of patients with COPD; therapy was moderately successful (CAT score 10–19) in more than sixty percent of respondents.
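The four outcome categories above amount to a simple threshold rule on the total CAT score. A minimal sketch follows, assuming the total ranges from 0 to 40 (eight items scored 0 to 5); the function name is hypothetical and the code is an illustration, not the authors' procedure.

```python
def cat_therapy_category(cat_score: int) -> str:
    """Map a total CAT score to the outcome category used in the abstract."""
    if not 0 <= cat_score <= 40:
        raise ValueError("CAT total score must be between 0 and 40")
    if cat_score < 10:
        return "successful"
    if cat_score <= 19:
        return "moderately successful"
    if cat_score <= 30:
        return "less successful"
    return "unsuccessful"


# Hypothetical scores, one per category.
labels = [cat_therapy_category(s) for s in (4, 15, 24, 35)]
print(labels)
```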
- Research Article
- 10.1164/ajrccm.2025.211.abstracts.a4955
- May 1, 2025
- American Journal of Respiratory and Critical Care Medicine
RATIONALE: Chronic obstructive pulmonary disease (COPD) is traditionally associated with a history of smoking; however, up to 30% of the global COPD burden occurs in never smokers. Given the increasing global burden of COPD, there is urgent interest in developing a broader understanding of “non-traditional” COPD risk factors, to inform development of novel therapeutics and preventative strategies. This study aimed to identify risk factors for never-smoking-related COPD in Ontario, Canada in a real-world population cohort. METHODS: We conducted a longitudinal, population-based cohort study using provincial health administrative databases linked to survey data in Ontario, Canada. We included all individuals above the age of 35 years residing in Ontario between 2000/2001 and 2018/2019 with survey-recorded smoking information. Individuals with COPD were identified using a validated case definition. Multivariable logistic regression models were constructed to evaluate associations for COPD stratified by smoking status. Analyses were performed using SAS (SAS Institute, Cary, North Carolina, USA). RESULTS: The total cohort comprised 151,290 individuals, with 48.4% characterized as never smokers. The prevalence of COPD among never smokers was 9.7%, constituting 25.1% of all identified COPD cases. Never smokers with COPD were diagnosed at an older age (median 72, IQR 61-81) compared to smokers with COPD (median 64, IQR 55-73) and were predominantly female (67.1%). History of asthma had the strongest association with risk of incident never-smoking COPD (aOR 3.95, 95% CI 3.69-4.23). Incident COPD in never smokers was also more strongly associated with a high comorbidity rate (aOR 2.73, 95% CI 1.96-3.79) and previous respiratory disease (aOR 2.3, 95% CI 1.95-2.7). Environmental tobacco smoke was a significant risk factor for COPD, with a stronger association in smokers (aOR 1.86, 95% CI 1.77-1.95) than in never smokers (aOR 1.2, 95% CI 1.04-1.37).
Increased BMI > 30 kg/m² was found to be a risk factor for COPD only in never smokers (aOR 1.32, 95% CI 1.24-1.41). Of air pollution measures, only increasing annual average PM2.5 exposure was associated with incident COPD and was similar between never smokers and smokers (aOR 1.08, 95% CI 1.03-1.13 versus aOR 1.05, 95% CI 1.02-1.08, respectively). Female sex (aOR 0.86, 95% CI 0.81-0.91) and increasing educational attainment were found to be protective of incident never-smoking-related COPD. CONCLUSIONS: As smoking rates in high-income countries decline, the relative importance of never smoking COPD will increase. This uniquely large cohort highlights the substantial burden of and identifies independent risk factors for incident COPD in never smokers.
- Abstract
- 10.1136/thoraxjnl-2012-202678.387
- Nov 19, 2012
- Thorax
BackgroundThe COPD (chronic obstructive pulmonary disease) assessment test (CAT) is a recently introduced, simple to use health status instrument, which takes less time to complete than better-established health status instruments...
- Abstract
- 10.1016/j.chest.2019.08.1533
- Oct 1, 2019
- Chest
COPD EXACERBATION RATE BY BASELINE COPD ASSESSMENT TEST SCORE IN THE DYNAGITO STUDY
- Research Article
- 10.3724/sp.j.1008.2013.00839
- Nov 28, 2013
- Academic Journal of Second Military Medical University
Objective: To observe the correlation between the chronic obstructive pulmonary disease (COPD) assessment test (CAT) score and prognostic factors, so as to investigate the value of the CAT score in predicting the prognosis of COPD. Methods: A total of 81 patients with newly diagnosed COPD in our hospital from Jul. 2011 to Sep. 2012, who were not using an inhaled corticosteroid (ICS)/long-acting β2 agonist (LABA) or a long-acting antimuscarinic agent (LAMA), were divided into groups A (low risk, fewer symptoms), B (low risk, more symptoms), C (high risk, fewer symptoms), and D (high risk, more symptoms) according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD, 2011 edition), and were given ICS/LABA or ICS/LABA+LAMA treatment for 3 months. The CAT score, age, smoking quantity, pulmonary function indices, body mass index (BMI), 6-min walking distance (6MWD), modified British Medical Research Council (mMRC) dyspnea scale, and number of acute exacerbations of COPD (AECOPD) in the previous year were collected before and after treatment. Clinical characteristics analysis and correlation analysis were performed. Results: The average age of the 81 COPD patients was (66.27±8.52) years, with 88.89% being male and 85.19% having a smoking history. The proportions of groups A, B, C, and D were 8.64%, 30.86%, 4.94%, and 55.56% before treatment, respectively. The values of forced expiratory volume in one second (FEV1), FEV1 as a percentage of the predicted value (FEV1%Pred), forced vital capacity (FVC), FVC as a percentage of the predicted value (FVC%Pred), peak expiratory flow (PEF), PEF as a percentage of the predicted value (PEF%Pred), and 6MWD in the CAT score ≥10 groups were significantly lower than those in the CAT score <10 group (P<0.05). These parameters were not significantly different among the CAT score 10-20, 20-30, and ≥30 groups. The mMRC scale and number of AECOPD in the CAT score ≥20 groups were significantly higher than those in the CAT score <10 group (P<0.05). No significant difference in FEV1/FVC was found among the CAT score groups. The CAT score was significantly correlated with mMRC scale (pre-treatment r²=0.417, P<0.001; post-treatment r²=0.19, P<0.001), 6MWD (pre-treatment r²=0.320, P<0.001; post-treatment r²=0.19, P<0.001), pre-treatment FEV1 (r²=0.177, P=0.0015), FEV1%Pred (r²=0.125, P=0.002), PEF (r²=0.164, P=0.0024), PEF%Pred (r²=0.129, P=0.0076), FVC (r²=0.098, P=0.021), FVC%Pred (r²=0.094, P=0.024), FEV1/FVC (r²=0.101, P=0.0057), and AECOPD number (r²=0.059, P=0.028); it was not correlated with smoking quantity (r²=0.041, P=0.083), BMI (r²=0.00, P=0.89), or post-treatment FEV1 (r²=0.01, P=0.22) or FEV1%Pred (r²=0.003, P=0.09). Conclusion: COPD is prone to occur in male smokers, with the highest proportion found in group D. The CAT score correlates well with pre- and post-treatment mMRC scale and exercise capacity, suggesting it has potential for predicting the prognosis of COPD.
- Research Article
5
- 10.3390/healthcare7010012
- Jan 18, 2019
- Healthcare
Rationale/Objective: The Behavioral Risk Factor Surveillance System (BRFSS) health survey has been used to describe the epidemiology of chronic obstructive pulmonary disease (COPD) in the US. By addressing respiratory symptoms and tobacco use, it could also be used to characterize COPD risk. Methods: Four US states added questions to the 2015 BRFSS regarding productive cough, shortness of breath, dyspnea on exertion, and tobacco duration. We defined three COPD risk categories: provider-diagnosed COPD (by self-report), high risk for COPD (≥10 years of tobacco smoking and at least one significant respiratory symptom), and low risk (neither diagnosed COPD nor high risk). Disease burden was defined by respiratory symptoms and health impairments. Data were analyzed using multiple logistic regression models with age as a covariate. Results: Among 35,722 adults ≥18 years, the overall prevalence of COPD and of high risk for COPD were 6.6% and 5.1%, respectively. Differences among COPD risk groups were evident based on gender, race, age, geography, tobacco use, health impairments, and respiratory symptoms. Risk for disease was seen early: 3.75% of 25–34-year-olds met high-risk criteria. Longer tobacco duration was associated with an increased prevalence of COPD, particularly >20 years. Seventy-nine percent of persons ≥45 years old with frequent shortness of breath (SOB) reported having or being at risk of COPD, reflecting disease burden. Conclusion: These data, representing nearly 18% of US adults, indicate that those at high risk for COPD share many, but not all, of the characteristics of persons diagnosed with the disease and demonstrate the value of the BRFSS as a tool to define lung health at a population level.
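The three risk categories described in the Methods can be read as a small decision rule. The sketch below is one possible encoding under that reading; the function and parameter names are hypothetical, and how the survey determined a "significant" respiratory symptom is not specified in the abstract, so it is passed in as a count.

```python
def copd_risk_category(diagnosed: bool, smoking_years: float,
                       significant_symptoms: int) -> str:
    """Assign one of the three risk categories described in the abstract."""
    # Self-reported provider diagnosis takes precedence over risk status.
    if diagnosed:
        return "diagnosed COPD"
    # High risk: at least 10 years of smoking plus at least one symptom.
    if smoking_years >= 10 and significant_symptoms >= 1:
        return "high risk"
    return "low risk"


# Hypothetical respondents.
respondents = [
    copd_risk_category(True, 0, 0),
    copd_risk_category(False, 12, 1),
    copd_risk_category(False, 12, 0),
    copd_risk_category(False, 5, 2),
]
print(respondents)
```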
- Research Article
- 10.3760/cma.j.issn.1673-436x.2018.05.003
- Mar 5, 2018
- Chinese Journal of Asthma
Objective: To explore the association of familial aggregation with lung function damage and COPD Assessment Test (CAT) scores in patients with chronic obstructive pulmonary disease (COPD), and the correlation between lung function damage and CAT scores. Methods: A prospective analysis was conducted on patients with COPD at Shengjing Hospital of China Medical University from January 2016 to May 2017. The diagnostic criteria were in line with the 2017 Global Initiative for Chronic Obstructive Lung Disease report. The gender, age, body mass index, smoking index, and family history of the subjects were recorded. The case group was divided into a family history group and a no family history group. The CAT and lung function tests (including impulse oscillometry, plethysmography, pulmonary ventilation, and diffusion function testing) were performed on all subjects. We analyzed the association of familial aggregation with lung function and CAT scores in the two groups, and the correlation between lung function and CAT scores. Results: ① A total of 102 cases were included in the case group; 59 cases were in the family history group (57.84%) and 43 cases were in the no family history group (42.16%). There was no statistically significant difference between the two groups in gender, age, body mass index, smoking index, or severity of airway limitation. ② There was no statistically significant difference between the two groups in the other indexes, except for carbon monoxide diffusing capacity as a percentage of the predicted value (P<0.05) and carbon monoxide diffusing capacity/alveolar volume as a percentage of the predicted value (P<0.05). ③ There was no statistically significant difference in CAT scores between the family history group and the no family history group. ④ In the no family history group, forced expiratory volume in the first second as a percentage of predicted (FEV1%pred) and forced expiratory volume in the first second (FEV1)/forced vital capacity (FVC) were not related to CAT scores. There was a weak positive correlation between FEV1%pred, FEV1/FVC, and CAT scores in the family history group. Conclusions: There was no relationship between the familial aggregation of COPD and lung function or CAT scores. The correlation between lung function and CAT scores is weak. Key words: Chronic obstructive pulmonary disease; Familial aggregation; Lung function; COPD Assessment Test scores
- Research Article
41
- 10.1186/s12890-023-02758-0
- Jan 2, 2024
- BMC Pulmonary Medicine
Chronic obstructive pulmonary disease (COPD) frequently coexists with other chronic diseases, known as comorbidities. They negatively impact prognosis, exacerbations, and quality of life in COPD patients. However, no studies have explored the impact of these comorbidities on COPD clinical control criteria. The objective was to determine the relationship between individualized comorbidities and COPD clinical control criteria. This was an observational, multicenter, cross-sectional study performed in Spain involving 4801 patients with severe COPD (forced expiratory volume in the first second [FEV1%] < 50% of predicted). Clinical control criteria were defined by the combination of COPD assessment test (CAT) scores (≤16 vs ≥17) and exacerbations in the previous three months (none vs ≥1). Binary logistic regression adjusted by age and FEV1% was performed to identify comorbidities potentially associated with the lack of control of COPD. Secondary endpoints were the relationships of individualized comorbidities with COPD assessment test scores and with exacerbations within the last three months. The most frequent comorbidities were arterial hypertension (51.2%), dyslipidemia (36.0%), diabetes (24.9%), obstructive sleep apnea-hypopnea syndrome (14.9%), anxiety (14.1%), heart failure (11.6%), depression (11.8%), atrial fibrillation (11.5%), peripheral arterial vascular disease (10.4%), and ischemic heart disease (10.1%). After age and FEV1% adjustment, comorbidities related to lack of clinical control were cardiovascular diseases (heart failure, peripheral vascular disease, and atrial fibrillation; p < 0.0001), psychologic disorders (anxiety and depression; all p < 0.0001), metabolic diseases (diabetes, arterial hypertension, and abdominal obesity; all p < 0.001), sleep disorders (p < 0.0001), anemia (p = 0.015), and gastroesophageal reflux (p < 0.0001). These comorbidities were also related to previous exacerbations and COPD assessment test scores.
Comorbidities are frequent in patients with severe COPD, negatively impacting COPD clinical control criteria. They are related to health-related quality of life measured by the COPD assessment test. Our results suggest that comorbidities should be investigated and treated in these patients to improve their clinical control. Study question: What is the impact of comorbidities on COPD clinical control criteria? Among 4801 patients with severe COPD (27.5% controlled and 72.5% uncontrolled), after adjustment by age and FEV1%, comorbidities related to lack of clinical control were cardiovascular diseases (heart failure, peripheral vascular disease and atrial fibrillation; p < 0.0001), psychologic disorders (anxiety and depression; p < 0.0001), metabolic diseases (diabetes, arterial hypertension and abdominal obesity; p < 0.001), obstructive sleep apnea-hypopnea syndrome (p < 0.0001), anaemia (p = 0.015) and gastroesophageal reflux (p < 0.0001), which were related to previous exacerbations and COPD assessment test scores. Comorbidities are related to health-related quality of life measured by the COPD assessment test scores and history of exacerbations in the previous three months.
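The clinical control criteria combine two thresholds: a CAT score cutoff and exacerbations in the previous three months. The sketch below assumes that a patient counts as controlled only when both the CAT score is ≤16 and no exacerbation occurred; the abstract states the two cutoffs but not this exact combination rule, so the logic and the function name are assumptions.

```python
def is_clinically_controlled(cat_score: int, exacerbations_3m: int) -> bool:
    """Return True when both control criteria are met (assumed AND rule)."""
    low_impact = cat_score <= 16          # CAT cutoff from the abstract
    stable = exacerbations_3m == 0        # no exacerbation in the last 3 months
    return low_impact and stable


# Hypothetical patients as (CAT score, exacerbations in previous 3 months).
patients = [(12, 0), (12, 1), (17, 0), (25, 2)]
controlled = [is_clinically_controlled(cat, ex) for cat, ex in patients]
print(controlled)
```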
- Research Article
25
- 10.1016/j.rmed.2019.03.007
- Mar 21, 2019
- Respiratory medicine
Determinants of CAT (COPD Assessment Test) scores in a population of patients with COPD in central and Eastern Europe: The POPE study.
- Research Article
19
- 10.4274/mmj.galenos.2022.06787
- Jun 1, 2022
- Medeniyet Medical Journal
Objective: In this study, we aimed to investigate the compatibility of modified Medical Research Council (mMRC) and COPD assessment test (CAT) scores in evaluating the symptom status of patients with chronic obstructive pulmonary disease (COPD). Methods: The study was planned as a single-center, cross-sectional study. Four separate receiver operating characteristic (ROC) curves of the CAT score were generated, one for each mMRC score from 1 to 4. Results: A total of 228 patients with stable COPD (mean age 64.2±8.2 years; 88.6% male) were included. A strong positive correlation was detected between CAT and mMRC (r=0.60, p<0.001). However, 32 patients had mMRC<2 but CAT≥10, while 21 patients had CAT<10 but mMRC≥2. Thus, in 53 patients the CAT and mMRC scores did not agree on the assessed symptom status. According to the ROC analysis, the mMRC scores of 1 to 4 were most compatible with CAT scores of 10, 10, 15, and 20, respectively. Conclusions: Our data, extending the current evidence, suggest that a CAT score of 10 may be more compatible with an mMRC score of 1. Moreover, although a high mMRC or CAT score alone may be sufficient to assign a patient to a high symptom group, we think mMRC and CAT need to be evaluated together before assigning a patient to a low symptom group. This can prevent patients with high symptoms from being misclassified as having low symptoms because of insufficient symptom evaluation.
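The discordance reported in the Results (mMRC<2 with CAT≥10, or CAT<10 with mMRC≥2) can be checked per patient with the two cutoffs. A minimal sketch follows; the function name, the returned labels, and the example scores are hypothetical.

```python
def symptom_agreement(mmrc: int, cat: int) -> str:
    """Compare symptom status by mMRC (cutoff 2) and CAT (cutoff 10)."""
    high_by_mmrc = mmrc >= 2
    high_by_cat = cat >= 10
    if high_by_mmrc == high_by_cat:
        return "concordant"
    # The two instruments disagree; report which one flags high symptoms.
    return "high by CAT only" if high_by_cat else "high by mMRC only"


# Hypothetical (mMRC, CAT) pairs.
pairs = [(1, 8), (3, 22), (1, 14), (2, 7)]
results = [symptom_agreement(m, c) for m, c in pairs]
print(results)
```

Under the conclusion above, only the first pair (low on both instruments) would be assigned to a low symptom group.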
- Supplementary Content
45
- 10.4103/1817-1737.128843
- Jan 1, 2014
- Annals of Thoracic Medicine
The Saudi Thoracic Society (STS) launched the Saudi Initiative for Chronic Airway Diseases (SICAD) to develop a guideline for the diagnosis and management of chronic obstructive pulmonary disease (COPD). This guideline is primarily intended for internists and general practitioners. Though epidemiological data related to COPD are scant, the SICAD panel believes that COPD prevalence is increasing in Saudi Arabia due to the increasing prevalence of tobacco smoking among men and women. To overcome the underutilization of spirometry for diagnosing COPD, handheld spirometry is recommended to screen individuals at risk for COPD. A unique feature of this guideline is its simplified, practical approach of classifying COPD into three classes based on symptoms as per the COPD Assessment Test (CAT) and the risk of exacerbations and hospitalization. Patients with a low risk of exacerbation (<2 in the past year) are classified as either Class I when they have fewer symptoms (CAT < 10) or Class II when they have more symptoms (CAT ≥ 10). High-risk COPD patients, as manifested by ≥2 exacerbations or hospitalization in the past year irrespective of baseline symptoms, are classified as Class III. Class I and II patients require bronchodilators for symptom relief, while Class III patients are recommended to use medications that reduce the risk of exacerbations. The guideline recommends screening for co-morbidities and suggests a comprehensive management approach including pulmonary rehabilitation for those with a CAT score ≥10. The article also discusses the diagnosis and management of acute exacerbations in COPD.
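The three-class scheme described above can be sketched as a short rule. The code assumes that exacerbations and hospitalizations in the past year are pooled into one event count with a cutoff of 2; the abstract does not say whether a single hospitalization alone places a patient in Class III, so that simplification, like the function name, is an assumption.

```python
def sicad_class(cat_score: int, events_past_year: int) -> str:
    """Classify a patient as Class I, II, or III per the described scheme."""
    # High risk overrides symptoms: two or more events means Class III.
    if events_past_year >= 2:
        return "III"
    # Low risk: split by symptom burden at a CAT score of 10.
    return "I" if cat_score < 10 else "II"


# Hypothetical patients as (CAT score, events in the past year).
classes = [sicad_class(c, e) for c, e in [(6, 0), (14, 1), (6, 3), (22, 2)]]
print(classes)
```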
- Research Article
9
- 10.1080/07853890.2022.2055134
- Mar 26, 2022
- Annals of Medicine
Purpose Our study aimed to compare the predictive value of the COPD Assessment Test (CAT) score at baseline and short-term change in CAT for future exacerbations in chronic obstructive pulmonary disease (COPD) patients. Methods This was a multicentre prospective study. Patients with COPD were recruited into the study and followed up for one year. CAT score and exacerbation in the previous year were collected at baseline. Change in CAT was defined as CAT score changing between baseline and the 6-month follow-up. Exacerbation was recorded during the one-year follow-up from 0th to 12th month. Result A total of 536 patients were enrolled for final analysis. The mean baseline CAT score was 14.5 ± 6.6 and the median (IQR) change in CAT was −2 (8). On Cox regression analysis, baseline CAT score, change in CAT and history of exacerbation were independent risk factors for exacerbation in the one-year follow-up. Compared with the r value of correlation between baseline CAT score and frequency of exacerbations during the one-year follow-up (r = 0.286, p < .001), that correlation between the change in CAT and frequency of exacerbations during follow-up was higher (r = 0.421, p < .001). The receiver operating characteristic (ROC) curves showed that change in CAT had a better predictive capacity for future exacerbation than baseline CAT (0.789 versus 0.609, p = .001). The ROC showed that change in CAT also had a better predictive capacity for future exacerbation than exacerbation in the previous year (0.789 versus 0.689, p = .011). Conclusion The correlation between baseline CAT score and future exacerbation was weak, however, the correlation between change in CAT and future exacerbation was moderate. Change in CAT in the short term had a better predictive value for future exacerbations of COPD than baseline CAT and exacerbation in the previous year.
- Research Article
- 10.12729/jbr.2015.16.4.134
- Dec 1, 2015
- Journal of Biomedical Research
Chronic obstructive pulmonary disease (COPD) is associated with multiple comorbidities, including depression, which carries a higher risk of exacerbation and hospitalization in patients with stable COPD. The COPD Assessment Test (CAT) is a recently developed questionnaire designed as an alternative to other complex, time-consuming tools for quantifying the symptom burden of COPD in routine practice. A correlation between the CAT and depression scales could be useful for early evaluation and management of depression in COPD patients. Thus, we investigated the relationship between the CAT and depression as measured by the Patient Health Questionnaire-9 (PHQ-9). We performed a retrospective observational COPD cohort study. A total of 97 patients were enrolled. The Korean versions of the CAT and PHQ-9 were completed by stable patients. A correlation analysis was performed between the PHQ-9 and CAT scores. Among the groups based on the 2011 GOLD guidelines, significant depression occurred only in GOLD B and D patients (40% and 60%, respectively). The frequency of depression was significantly higher in the groups with higher CAT scores (20~29 versus ≥30; odds ratio: 5.67 versus 22.66). A significant association was observed between the PHQ-9 and CAT scores (r=0.545, P<0.001). Accordingly, the PHQ-9 score was significantly higher in COPD patients with a higher CAT score. The CAT is a simple and valuable predictor of depression in COPD patients, and it should be used frequently to detect COPD patients with depression in clinical practice.
- Front Matter
12
- 10.1016/s0140-6736(09)61535-x
- Aug 1, 2009
- The Lancet
COPD—more than just tobacco smoke