Abstract
‘Asthma’ is a complex disease that encapsulates a heterogeneous group of phenotypes and endotypes. Research to understand these phenotypes has previously been based on longitudinal wheeze patterns or hypothesis-driven observational criteria. The aim of this study was to use data-driven machine learning to identify asthma and wheeze phenotypes in children based on symptom and symptom history data, and to further characterize these phenotypes. The study population was an asthma-rich population of twins in Sweden aged 9–15 years (n = 752). Latent class analysis using current and historical clinical symptom data generated asthma and wheeze phenotypes. Characterization was then performed with regression analyses using diagnostic data: lung function and immunological biomarkers, parent-reported medication use and risk factors. The latent class analysis identified four asthma/wheeze phenotypes, early transient wheeze (15%), current wheeze/asthma (5%), mild asthma (9%) and moderate asthma (10%), and a healthy phenotype (61%). All wheeze and asthma phenotypes were associated with reduced lung function and with risk of hayfever compared to healthy. Children with mild and moderate asthma phenotypes were also more likely to have eczema, allergic sensitization and a family history of asthma. Furthermore, those with the moderate asthma phenotype had a higher eosinophil concentration (β 0.21, 95% CI 0.12, 0.30) compared to healthy and used short-term relievers at a higher rate than children with the mild asthma phenotype (RR 2.4, 95% CI 1.2–4.9). In conclusion, using a data-driven approach we identified four wheeze/asthma phenotypes, which further characterization validated as distinct from one another and which can be adapted for use by the clinician or researcher.
Highlights
Asthma is a heterogeneous disease often characterized by wheeze, cough, chest tightness and shortness of breath caused by multiple triggers, and changes over the life course [1]
The best-fitting latent class analysis (LCA) model was the 5-class model, based on the lowest Bayesian information criterion (BIC) and Akaike information criterion (AIC), the Lo-Mendell-Rubin test (LMR), which suggested that this model was significantly different from the 4-class model, and an entropy index approaching 1 (S2 Table)
The five phenotypes were given labels to best describe the profile of conditional probabilities: ‘Healthy’, ‘Early transient wheeze’, ‘Current wheeze/asthma’, ‘Mild asthma’ and ‘Moderate asthma’
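The model-selection criteria named in the highlights above (AIC, BIC and the entropy index) can be illustrated with a minimal sketch. It is not the authors' code: it assumes an LCA with binary symptom indicators (so the free-parameter count is the class proportions plus one item probability per class and item) and uses the common relative-entropy definition, which may differ from the software the study used. The function names are hypothetical, and the Lo-Mendell-Rubin test is omitted.

```python
import math


def lca_param_count(n_classes, n_binary_items):
    """Free parameters of an LCA with binary indicators (assumed model form):
    (K - 1) class proportions plus K * J item-response probabilities."""
    return (n_classes - 1) + n_classes * n_binary_items


def aic(log_lik, n_params):
    """Akaike information criterion; lower values indicate better fit."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def relative_entropy(posteriors):
    """Entropy index in [0, 1]; values near 1 mean clear class separation.

    posteriors: one row per child, one posterior class probability per class.
    """
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("the entropy index needs at least two classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # p * log(p) tends to 0 as p tends to 0
                total -= p * math.log(p)
    return 1.0 - total / (n_obs * math.log(n_classes))


def best_by_bic(fits, n_obs):
    """Return the class count with the lowest BIC.

    fits: dict mapping class count -> (log-likelihood, parameter count).
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))
```

In practice the log-likelihoods and posterior probabilities would come from the fitted LCA models for each candidate number of classes; the sketch only shows how the reported criteria are computed and compared.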
Summary
Asthma is a heterogeneous disease often characterized by wheeze, cough, chest tightness and shortness of breath caused by multiple triggers, and changes over the life course [1]. A number of modern data-driven machine learning approaches, such as latent class analysis (LCA), have been used to identify phenotypes [9, 10]. The data-driven approach is hypothesis-free: it relies on the statistical model to generate clusters of phenotypes from the variables added to the model rather than from pre-formulated hypotheses, and it has been shown to be appropriate for use in complex diseases such as asthma [9]. The aim of this study was first to use a data-driven approach to identify asthma and wheeze phenotypes based on symptom history data, and second to confirm that these phenotypes were relevant for clinicians and researchers by further characterization using diagnostic tests, biomarkers, asthma medication and risk factor history information.