Abstract
Background: Examining whether disease cases are clustered in space is an important part of epidemiological research. Another important part of spatial epidemiology is testing whether patients suffering from a disease are more, or less, exposed to environmental factors of interest than adequately defined controls. Both approaches involve determining the number of cases and controls (or population at risk) in specific zones. For cluster searches, this often must be done for millions of different zones, and doing it by calculating distances can lead to very lengthy computations. In this work we discuss the computational advantages of geographical grid-based methods and introduce open-source software (FGBASE) that we have created for this purpose.
Methods: Geographical grids based on the Lambert Azimuthal Equal Area projection are well suited for spatial epidemiology because they preserve area: each cell of the grid has the same area. We describe how data are projected onto such a grid, as well as grid-based algorithms for spatial epidemiological data mining. The software program that we have developed (FGBASE) implements these grid-based methods.
Results: The grid-based algorithms run extremely fast, particularly for cluster searches. When applied, as an example, to a cohort of French Type 1 Diabetes (T1D) patients, the grid-based algorithms detected potential clusters in a few seconds on a modern laptop. This compares very favorably to an equivalent cluster search using distance calculations instead of a grid, which took over 4 hours on the same computer. In the case study we discovered 4 potential clusters of T1D cases near the cities of Le Havre, Dunkerque, Toulouse and Nantes. As one example of environmental analysis with our software, we studied whether a significant association could be found between T1D and distance to vineyards with heavy pesticide use. None was found. In both examples, the software facilitates the rapid testing of hypotheses.
Conclusions: Grid-based algorithms for mining spatial epidemiological data reduce computational complexity and thus speed up computations. We believe that these methods and this software tool (FGBASE) will lower the computational barriers to entry for those performing epidemiological research.
Electronic supplementary material: The online version of this article (doi:10.1186/1476-072X-13-46) contains supplementary material, which is available to authorized users.
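To illustrate why counting on a grid avoids distance calculations, here is a minimal Python sketch. It is not the FGBASE implementation: the function names (`cell_of`, `count_per_cell`, `zone_total`), the 1 km cell size, and the coordinates are all hypothetical, and it assumes the points have already been projected to metres on an equal-area grid.

```python
from collections import Counter
from math import floor


def cell_of(x, y, cell_size):
    """Map projected coordinates (metres) to an integer (column, row) cell index."""
    return (floor(x / cell_size), floor(y / cell_size))


def count_per_cell(points, cell_size):
    """Count points in each grid cell in a single pass, with no distance calculations."""
    return Counter(cell_of(x, y, cell_size) for x, y in points)


def zone_total(counts, col, row, radius):
    """Sum counts over the square block of cells within `radius` cells of (col, row)."""
    return sum(
        counts.get((c, r), 0)
        for c in range(col - radius, col + radius + 1)
        for r in range(row - radius, row + radius + 1)
    )


# Hypothetical projected coordinates in metres (not data from the study).
cases = [(1200.0, 800.0), (1900.0, 950.0), (5100.0, 5200.0)]
controls = [(1500.0, 100.0), (2500.0, 2500.0), (5900.0, 5050.0), (9000.0, 9000.0)]

case_counts = count_per_cell(cases, cell_size=1000.0)
control_counts = count_per_cell(controls, cell_size=1000.0)

# Cases and controls in the 3x3 block of cells centred on cell (1, 0).
cases_in_zone = zone_total(case_counts, 1, 0, radius=1)
controls_in_zone = zone_total(control_counts, 1, 0, radius=1)
```

Once the per-cell counts exist, evaluating a candidate zone only touches the cells it covers, so the cost no longer depends on the number of individual cases and controls. A zone here is a square block of cells, which is a simplifying assumption made for this sketch.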
Highlights
Examining whether disease cases are clustered in space is an important part of epidemiological research
Cluster search fits into a hypothesis-free approach, while testing the effect of specific environmental factors fits into a hypothesis-driven approach
Although the software will run with any type of grid, the use of equal-area grids based on the Lambert Azimuthal Equal Area (LAEA) projection system is encouraged
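As a rough illustration of the LAEA projection mentioned in the highlights, the sketch below implements the standard forward formulas for a spherical Earth. It is an assumption-laden example, not FGBASE code: the function name `laea_forward`, the spherical (rather than ellipsoidal) model, the mean Earth radius, and the projection centre near central France are all chosen here for illustration only.

```python
from math import cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def laea_forward(lat_deg, lon_deg, lat0_deg, lon0_deg, radius=EARTH_RADIUS_M):
    """Project latitude/longitude (degrees) to LAEA x, y (metres) on a sphere.

    (lat0_deg, lon0_deg) is the projection centre, which maps to (0, 0).
    """
    phi, phi0 = radians(lat_deg), radians(lat0_deg)
    dlam = radians(lon_deg) - radians(lon0_deg)

    # Cosine of the angular distance between the point and the projection centre.
    cos_c = sin(phi0) * sin(phi) + cos(phi0) * cos(phi) * cos(dlam)
    if cos_c <= -1.0 + 1e-12:
        raise ValueError("the point antipodal to the centre cannot be projected")

    k = sqrt(2.0 / (1.0 + cos_c))  # area-preserving scale factor
    x = radius * k * cos(phi) * sin(dlam)
    y = radius * k * (cos(phi0) * sin(phi) - sin(phi0) * cos(phi) * cos(dlam))
    return x, y


# Hypothetical projection centre (46.5 N, 2.5 E), used only for this example.
centre_x, centre_y = laea_forward(46.5, 2.5, 46.5, 2.5)
north_x, north_y = laea_forward(47.5, 2.5, 46.5, 2.5)
```

A point one degree due north of the centre lands on the y axis at a distance of 2R·sin(0.5°) from the origin. Production work would normally use an ellipsoidal implementation from an established projection library rather than this spherical approximation.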
Summary
Examining whether disease cases are clustered in space is an important part of epidemiological research (see [1,2,3,4,5,6]). Another important part of spatial epidemiology is testing whether patients suffering from a disease are more, or less, exposed to some environmental factors of interest than adequately defined controls (see [7,8,9,10,11]). Both approaches involve determining the number of cases and controls (or population at risk) in specific zones. In the hypothesis-driven approach, once a hypothesis is chosen, the data are examined to see whether or not they support it.