Abstract
Introduction. Detecting interactions among multiple sources of pollution is a complex analysis that often requires multiple testing. Data mining techniques may be a useful screening method to identify variables (i.e. single sources and/or interactions) potentially associated with the outcome. In this study, a data mining method was combined with a classical statistical approach to explore and estimate the possible interactions between seven sources of pollution.
Methods. A nested case-control sample of 6,459 subjects (1:2 matching by gender, age, socioeconomic status and person-years) was extracted from a cohort of 50,000 people living in an area of Tuscany (Italy) in 2001-2010, linked to hospitalizations for respiratory diseases. Three exposure classes were defined for each source using the 50th and 80th percentiles of the concentrations of PM10 and Cd. Variables representing interactions between sources were defined by considering all possible combinations of exposure classes. Classification models were built with a data mining method, random forest (RF), and a permutation method was used to select the variables that best predicted the respiratory outcome. The statistical evidence for the selected variables was assessed by logistic regression.
Results. Ten variables, corresponding to potential interactions, were identified by RF. Logistic regression applied to the nested case-control sample confirmed the statistical evidence for all the selected interactions. Logistic regression applied to the entire cohort, adjusted for gender, age, socioeconomic status and person-years, confirmed statistical evidence for 80% of the selected interactions.
Conclusion. The complementary application of an exploratory data mining method and a classical statistical approach proved to be an efficient way to identify and estimate interactions in geographical areas characterized by multiple sources of pollution.
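The exposure coding described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy concentrations, the inclusive class boundaries (a value equal to a percentile falls in the lower class), the linear-interpolation percentile, and the restriction to pairs of sources are all assumptions made for illustration. It shows how concentrations might be cut into three classes at the 50th and 80th percentiles and how one 0/1 indicator per combination of classes could be derived for two sources.

```python
from itertools import product


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def exposure_classes(concentrations):
    """Assign class 0 (<= P50), 1 (> P50 and <= P80) or 2 (> P80).

    The boundary convention is an assumption; the abstract does not state it.
    """
    p50 = percentile(concentrations, 50)
    p80 = percentile(concentrations, 80)
    return [0 if c <= p50 else 1 if c <= p80 else 2 for c in concentrations]


def interaction_indicators(classes_a, classes_b):
    """Build one 0/1 indicator per combination of classes of two sources."""
    return {
        (a, b): [int(x == a and y == b) for x, y in zip(classes_a, classes_b)]
        for a, b in product(range(3), repeat=2)
    }


# Hypothetical concentrations for two sources, one value per subject.
source_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
source_b = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]

classes_a = exposure_classes(source_a)
classes_b = exposure_classes(source_b)
indicators = interaction_indicators(classes_a, classes_b)
```

Each subject falls into exactly one of the nine class combinations, so the indicators for a pair of sources are mutually exclusive. In a workflow like the one described, such indicators would be the candidate variables passed to the random forest and then to logistic regression.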