Abstract

Background: Like other branches of epidemiology and public health, environmental epidemiology faces the significant challenge of teasing out the health effects of a large number of interplaying risk factors characterized by small relative risks. Combining health and risk factor data across multiple cohorts and exposure databases is increasingly considered a cost-effective way of providing new analytical opportunities for research in environmental epidemiology, opportunities that are often unavailable through individual studies. Integrating data across studies has the potential to provide the statistical power required to obtain more robust estimates of health risks, to identify vulnerable sub-populations, and to explore statistical interactions between exposures and other risk factors, thereby improving our understanding of the environmental determinants of health and disease. While integrating existing data can provide a number of advantages, it also requires considerable effort and presents important methodological and technical challenges.

Aims and Objectives: The general objectives of this thesis were to (1) outline tools and resources that facilitate collaborative large-scale epidemiological research projects, and (2) combine data from large observational cohorts and ambient air pollution exposure databases to explore associations of ambient air pollution exposure with respiratory health outcomes.

Methods: First, this work describes a number of methodological resources and software tools I have helped shape while working at Maelstrom Research, an interdisciplinary team of epidemiologists, statisticians, and computer scientists. I then show how these open-source and freely available tools can provide innovative solutions to facilitate collaborative epidemiological research by addressing issues of data documentation and discoverability, data harmonization, and data integration and co-analysis.

Second, using the tools described in the first section, data from two of Europe's largest observational cohorts, the LifeLines cohort study and UK Biobank, are harmonized and linked to ambient air pollution exposure databases to explore associations between ambient air pollution exposure and respiratory symptoms, both in the populations as a whole and in potentially vulnerable population subgroups. Using data from UK Biobank, the associations of air pollution exposure with lung function and chronic obstructive pulmonary disease (COPD) are also investigated, and potential vulnerability factors for these relationships are explored. Finally, to evaluate how sensitive health effect estimates are to the choice of air pollution exposure database, associations of air pollution with lung function and COPD in UK Biobank are compared using two different air pollution exposure databases.

Results: An open-source software application suite and a standardized metadata model for observational cohort studies are proposed to support documentation and dissemination of study metadata across collaborating institutions and to assist in the considerable task of harmonizing data across studies. Connecting the software via secure web services allows epidemiological consortia to implement a federated database infrastructure that supports seamless integration and co-analysis of study data while assuring the privacy of participants. Finally, these tools are deployed to meet the collaborative research needs of an international epidemiological research consortium to catalogue, harmonize and co-analyse individual-level data collected by multiple longitudinal cohort studies and link them to databases of area-level environmental exposures.

Applying these tools to multicentre research projects on air pollution and respiratory health resulted in well-powered studies that showed clinically meaningful associations between particulate matter and nitrogen dioxide exposure and respiratory health. Positive associations were found between ambient air pollution exposure and respiratory symptoms, as well as COPD prevalence. Results also showed significant negative associations between exposure to outdoor air pollutants and lung function. Strong statistical power allowed subgroup analyses, which suggested consistently stronger associations of air pollution exposure with respiratory symptoms, lung function and COPD among participants with lower incomes compared to those with higher incomes. Lastly, comparing associations using different air pollution exposure databases resulted in marked changes in exposure-response coefficients, suggesting that the use of different exposure assessment methods can be an important source of heterogeneity in multicentre projects.

Conclusion: The tools and resources presented in this thesis provide the epidemiological research community with free and open-source solutions to enhance the use, particularly the collaborative use, of epidemiological research data. The harmonization and co-analysis of large environmental exposure and health datasets can provide the statistical power required to overcome the limitations of previous studies in environmental epidemiology related to random exposure misclassification coupled with relatively small excess risks. Effectively combining datasets can lead to more robust estimates of the effects of environmental exposures on health, a better understanding of interactions between risk factors, and identification of high-risk population subgroups. In turn, these factors can considerably improve our comprehension of exposure-response relationships, which help define the public health response to environmental health risks.
