Classification and regression trees for epidemiologic research: an air pollution example.

Katherine Gass, Matthew J Strickland, Howard H Chang, W Dana Flanders, Mitch Klein

doi:10.1186/1476-069x-13-17

Abstract

Background: Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging. We demonstrate how classification and regression trees can be used to generate hypotheses regarding joint effects from exposure mixtures.

Methods: We illustrate the approach by investigating the joint effects of CO, NO2, O3, and PM2.5 on emergency department visits for pediatric asthma in Atlanta, Georgia. Pollutant concentrations were categorized as quartiles. Days when all pollutants were in the lowest quartile were held out as the referent group (n = 131) and the remaining 3,879 days were used to estimate the regression tree. Pollutants were parameterized as dichotomous variables representing each ordinal split of the quartiles (e.g. comparing CO quartile 1 vs. CO quartiles 2–4) and considered one at a time in a Poisson case-crossover model with control for confounding. The pollutant-split resulting in the smallest P-value was selected as the first split and the dataset was partitioned accordingly. This process repeated for each subset of the data until the P-values for the remaining splits were not below a given alpha, resulting in the formation of a “terminal node”. We used the case-crossover model to estimate the adjusted risk ratio for each terminal node compared to the referent group, as well as the likelihood ratio test for the inclusion of the terminal nodes in the final model.

Results: The largest risk ratio corresponded to days when PM2.5 was in the highest quartile and NO2 was in the lowest two quartiles (RR: 1.10, 95% CI: 1.05, 1.16). A simultaneous Wald test for the inclusion of all terminal nodes in the model was significant, with a chi-square statistic of 34.3 (p = 0.001, with 13 degrees of freedom).

Conclusions: Regression trees can be used to hypothesize about joint effects of exposure mixtures and may be particularly useful in the field of air pollution epidemiology for gaining a better understanding of complex multipollutant exposures.
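The splitting procedure described in the Methods above can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation: every function name, the synthetic data, and the default alpha of 0.05 are hypothetical (the abstract says only "a given alpha"). In particular, `rate_test_p_value` is a crude stand-in that compares mean daily counts with a normal approximation and does no confounder adjustment, whereas the paper obtains each P-value from a Poisson case-crossover model.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Day = Dict[str, int]          # pollutant quartiles (1-4) plus a daily "visits" count
Path = Tuple[Tuple[str, str, int], ...]

POLLUTANTS = ["CO", "NO2", "O3", "PM25"]
CUTS = [1, 2, 3]              # ordinal splits: Q1 vs Q2-4, Q1-2 vs Q3-4, Q1-3 vs Q4


def rate_test_p_value(days: List[Day], pollutant: str, cut: int) -> Optional[float]:
    """Stand-in test (assumption): two-sided z-test on mean daily counts.

    Returns None when the split leaves one side empty or cannot be tested.
    """
    low = [d["visits"] for d in days if d[pollutant] <= cut]
    high = [d["visits"] for d in days if d[pollutant] > cut]
    if not low or not high:
        return None
    total = sum(low) + sum(high)
    if total == 0:
        return None
    pooled = total / (len(low) + len(high))
    se = math.sqrt(pooled * (1 / len(low) + 1 / len(high)))
    z = (sum(high) / len(high) - sum(low) / len(low)) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P-value


def grow_tree(
    days: List[Day],
    p_value: Callable[[List[Day], str, int], Optional[float]],
    alpha: float = 0.05,
    path: Path = (),
) -> List[Tuple[Path, List[Day]]]:
    """Recursively partition days; return the terminal nodes as (path, days)."""
    best = None
    for pollutant in POLLUTANTS:          # consider each pollutant-split one at a time
        for cut in CUTS:
            p = p_value(days, pollutant, cut)
            if p is not None and (best is None or p < best[0]):
                best = (p, pollutant, cut)
    # Stop when no remaining split has a P-value below alpha: terminal node.
    if best is None or best[0] >= alpha:
        return [(path, days)]
    _, pollutant, cut = best
    low = [d for d in days if d[pollutant] <= cut]
    high = [d for d in days if d[pollutant] > cut]
    return (grow_tree(low, p_value, alpha, path + ((pollutant, "<=", cut),))
            + grow_tree(high, p_value, alpha, path + ((pollutant, ">", cut),)))


# Synthetic data (assumption): one day per quartile combination, with an
# elevated count only when PM25 is in the top quartile.
days = [
    {"CO": co, "NO2": no2, "O3": o3, "PM25": pm, "visits": 30 if pm == 4 else 10}
    for co in range(1, 5) for no2 in range(1, 5)
    for o3 in range(1, 5) for pm in range(1, 5)
]

# Hold out days with every pollutant in the lowest quartile as the referent group.
referent = [d for d in days if all(d[p] == 1 for p in POLLUTANTS)]
exposed = [d for d in days if not all(d[p] == 1 for p in POLLUTANTS)]

nodes = grow_tree(exposed, rate_test_p_value)
for node_path, node_days in nodes:
    print(node_path, len(node_days))
```

On this synthetic data the sketch produces two terminal nodes, separated at the top quartile of PM25. Passing the test as a callable keeps the recursion independent of the model, so an adjusted model's P-value could be substituted without changing `grow_tree`.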

Highlights

Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging
Each terminal node represents a subset of days with a specific pattern of pollutants that the algorithm could not split further, conditional on the confounders included in the model
Many issues arise when dealing with mixtures, including exposure covariation, physiological and chemical interaction, joint effects, and novel exposure metrics

Summary

Introduction

Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging. Throughout the course of a day and a lifetime, our total exposure can be conceptualized as a complex mixture of different individual exposures. Advances in science have improved our ability to measure these exposures; a major challenge is how best to characterize and relate these mixtures to health endpoints. Characterization of mixtures for epidemiologic research depends upon both the data that can be obtained as well as [...] Statistical interaction is often assessed by including the product of two or more risk factors (exposures) in a regression model and using statistical tests to determine whether the resulting coefficient differs significantly from zero. As model complexity grows, so does the challenge of interpretation [2]. Parameter estimates may become unstable as the number of interaction terms increases.
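The product-term approach described above can be written out for two exposures. The notation here is illustrative and not taken from the paper: $Y$ is the outcome count, $X_1$ and $X_2$ are two exposures, and a log-linear (Poisson-type) form is assumed only for consistency with the Poisson model mentioned in the abstract.

```latex
\log \mathbb{E}[Y \mid X_1, X_2]
  = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2,
\qquad H_0 : \beta_3 = 0 .

% With k exposures each entered as a single term, the number of
% possible product terms of order two or higher is
\sum_{j=2}^{k} \binom{k}{j} = 2^{k} - k - 1,
\qquad \text{e.g. } k = 4 \;\Rightarrow\; 6 + 4 + 1 = 11 .
```

Under these assumptions, four pollutants already allow 11 product terms before any categorization into quartiles, which gives a sense of how quickly interpretation and estimate stability can degrade.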

Journal: Environmental Health
Publication Date: Mar 13, 2014
Citations: 91
License type: cc-by

Similar Papers

Utilizing Regression Trees to Identify Complex Patterns of Multipollutant Joint Effects
Katherine Gass ... Howard Chang
ISEE Conference Abstracts | VOL. 2013
19 Sep 2013

Associations between ambient air pollutant mixtures and pediatric asthma emergency department visits in three cities: a classification and regression tree approach.
Katherine Gass ... James A Mulholland
Environmental Health | VOL. 14
27 Jun 2015

Detection of endometriosis with the use of plasma protein profiling by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry
Haiyuan Liu ... Qizhai Li
Fertility and Sterility | VOL. 87
04 Jan 2007

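KLASIFIKASI KARAKTERISTIK KECELAKAAN LALU LINTAS DI KOTA DENPASAR DENGAN PENDEKATAN CLASSIFICATION AND REGRESSION TREES (CART) [Classification of traffic accident characteristics in Denpasar City using the classification and regression trees (CART) approach]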
I Gede Agus Jiwadiana ... I Komang Gde Sukarsa
E-Jurnal Matematika | VOL. 4
24 Nov 2015
