Abstract

Background: Over the last two decades, latent class analysis (LCA) has been used extensively to identify phenotypes of childhood wheeze; however, the external factors that affect the outcomes of LCA models in phenotype identification research have received less discussion. Objective: To understand the potential causes of variability in the number and types of wheeze phenotypes identified by LCA, and to assess the performance of these models under different conditions. Methods: We examined the effects of several data characteristics (sample size, age at data collection, and collection intervals) and of model dimensionality (number of latent classes) on the classification of wheeze phenotypes by latent class and transition analysis. We also identified critical and redundant data collection points for discriminating wheeze phenotypes using two variable selection methods: stepwise backward elimination and a genetic algorithm. Results: Small samples suggested fewer wheeze sub-groups because they were underpowered to detect less frequently observed phenotypes. There was a strong interplay between sample size, number of data collection points, and the optimal number of wheeze phenotypes extracted. A variable selection algorithm applied to LCA excluded 6 redundant/non-informative time points (Month 6, Month 30, Month 69, Month 91, Month 140 and Month 198), as they contained very little discriminative and/or additional information about the wheeze classes. Conclusions: Input data characteristics and model complexity strongly influenced the classification of wheeze phenotypes and the prevalence of each class in the population.
