Background
Environmental factors have been associated with adverse health effects in epidemiological studies. The main exposure variable is usually chosen based on prior knowledge or statistical methods. This can be challenging when evidence supporting prior knowledge is scarce, or when statistical methods must contend with collinearity. This study aimed to investigate the relative importance of environmental variables for under-five mortality in Malaysia using a random forest approach.

Method
We applied conditional permutation importance via a random forest (CPI-RF) to evaluate the relative importance of weather- and air pollution-related environmental factors for daily under-five mortality in Malaysia. The study period spanned January 1, 2014 to December 31, 2016. In data preparation, deviation mortality counts were derived through a generalized additive model, adjusting for long-term trend and seasonality. Analyses were conducted by mortality cause (all-cause, natural-cause, or external-cause) and by data structure (continuous, categorical, or all types [i.e., all continuous variables together with all categorical variables]). The main analysis comprised two stages. In Stage 1, Boruta selection was applied as a preliminary screen to remove clearly unimportant variables. In Stage 2, the variables retained by Boruta were used in the CPI-RF analysis. The final importance value was the average across a 10-fold cross-validation.

Result
Some heat-related variables (maximum temperature, heat wave), temperature variability, and haze-related variables (PM10, PM10-derived haze index, PM10- and fire-derived haze index, fire hotspot) were among the prominent variables associated with under-five mortality in Malaysia. The important variables were consistent across all-cause and natural-cause mortality and across the sensitivity analyses. However, the most important variables differed between natural-cause and external-cause under-five mortality.

Conclusion
Heat-related variables, temperature variability, and haze-related variables were consistently prominent for all-cause and natural-cause under-five mortality, but not for external-cause mortality.
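
As a rough illustration of the cross-validated importance step described in the Method, the Python sketch below averages a random forest permutation importance over 10 folds. It is not the study's implementation: it uses scikit-learn's marginal permutation importance as a stand-in for the conditional permutation importance (CPI) applied in the study, and it omits the Boruta screening and the GAM-derived deviation counts. The function name `cv_permutation_importance`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import KFold


def cv_permutation_importance(X, y, n_splits=10, seed=0):
    """Mean held-out permutation importance per column across K folds.

    Hypothetical helper: marginal (not conditional) permutation importance.
    """
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    per_fold = []
    for train_idx, test_idx in folds.split(X):
        # Fit the forest on the training folds only.
        forest = RandomForestRegressor(n_estimators=100, random_state=seed)
        forest.fit(X[train_idx], y[train_idx])
        # Score the drop in R^2 when each column is shuffled on the held-out fold.
        result = permutation_importance(
            forest, X[test_idx], y[test_idx], n_repeats=5, random_state=seed
        )
        per_fold.append(result.importances_mean)
    # Average the fold-level importances, one value per column.
    return np.mean(per_fold, axis=0)


# Synthetic stand-in data (assumption): only column 0 drives the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=300)

importance = cv_permutation_importance(X, y)
ranking = np.argsort(importance)[::-1]  # column indices, most important first
```

Marginal permutation can overstate the importance of correlated predictors, which is the collinearity concern raised in the Background; a conditional scheme permutes each variable within strata defined by its correlated covariates instead.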