Abstract
Objective: The heterogeneity of asthma has inspired widespread application of statistical clustering algorithms to a variety of datasets to identify potentially clinically meaningful phenotypes. The lack of a standardized data analysis approach for asthma clustering can affect reproducibility and clinical translation of results. Our objective was to identify common and effective data analysis practices in the asthma clustering literature and apply them to data from a Southern California population-based cohort of schoolchildren with asthma.

Methods: As of January 1, 2020, we reviewed key statistical elements of 77 asthma clustering studies. Guided by the literature, we used 12 input variables and three clustering methods (hierarchical clustering, k-medoids, and latent class analysis) to identify clusters in 598 schoolchildren with asthma from the Southern California Children’s Health Study (CHS).

Results: Clusters of children identified by latent class analysis were characterized by exhaled nitric oxide, FEV1/FVC, FEV1 percent predicted, asthma control, and allergy score, and were predictive of asthma control at two-year follow-up. Clusters from the other two methods were less clinically remarkable; they were primarily differentiated by sex and race/ethnicity and were less predictive of asthma control over time.

Conclusion: Upon review of the asthma phenotyping literature, common approaches to data clustering emerged. When these elements were applied to the Children’s Health Study data, latent class analysis clusters, represented by exhaled nitric oxide and spirometry measures, had clinical relevance over time.