Abstract
Background: Mapping disease rates over a region provides a visual illustration of underlying geographical variation of the disease and can be useful to generate new hypotheses on the disease aetiology. However, methods to fit the popular and widely used conditional autoregressive (CAR) models for disease mapping are not feasible in many applications due to memory constraints, particularly when the sample size is large. We propose a new algorithm, termed indiCAR, to fit a CAR model that can accommodate both individual and group level covariates while adjusting for spatial correlation in the disease rates. Our method scales well and works in very large datasets where other methods fail.
Results: We evaluate the performance of the indiCAR method through simulation studies. Our simulation results indicate that indiCAR provides reliable estimates of all the regression and random effect parameters. We also apply indiCAR to the analysis of data on neutropenia admissions in New South Wales (NSW), Australia. Our analyses reveal that lower rates of neutropenia admissions are significantly associated with individual level predictors including higher age, male gender and residence in an outer regional area, and with a group level predictor of social disadvantage, the socio-economic index for areas. A large value for the spatial dependence parameter is estimated after adjusting for individual and area level covariates. This suggests the presence of important variation in the management of cancer patients across NSW.
Conclusions: Incorporating individual covariate data in disease mapping studies improves the estimation of fixed and random effect parameters by utilizing information from multiple sources. Health registries routinely collect individual and area level information and thus could benefit by using indiCAR for mapping disease rates. Moreover, the natural applicability of indiCAR in a distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates.
CI NSW Study Reference Number: 2012/07/410. Dated: July 2012.
Highlights
Mapping disease rates over a region provides a visual illustration of underlying geographical variation of the disease and can be useful to generate new hypotheses on the disease aetiology
We explore whether there is any spatial variation in the rates of neutropenia admissions after adjusting for patients’ individual and clinical characteristics
The Accessibility/Remoteness Index of Australia (ARIA) variable was recorded at individual level rather than postal area level because the ARIA index varies within postal areas
Summary
Mapping disease rates over a region provides a visual illustration of underlying geographical variation of the disease and can be useful to generate new hypotheses on the disease aetiology. (Source: Huque et al. Int J Health Geogr (2016) 15:25.) [...] and risk factors in the presence of geographical variation [6]. These models can adjust for region-specific spatial random effects for correlated disease rates and for both individual- and region-specific covariates. Fitting such models carries a high computational burden when the sample size is large and when the number of individual and group level covariates is large. To alleviate such problems, investigators often adjust for the age and sex distribution of the underlying population through calculation of an offset in the model [7]. The effect of age and sex on disease risk cannot be estimated from these models. Such an approach also ignores a large number of potential individual level covariates that may be related to the underlying disease process and are readily available in health registries.
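To make the offset idea concrete, the sketch below computes expected counts by indirect standardisation, one common way such an offset is built; it is not necessarily the exact procedure used in [7]. All rates, populations, observed counts and function names here are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical reference rates (cases per person) by (age band, sex) stratum.
REFERENCE_RATES = {
    ("<50", "F"): 0.001,
    ("<50", "M"): 0.002,
    ("50+", "F"): 0.004,
    ("50+", "M"): 0.005,
}

# Hypothetical population counts per stratum for two areas.
AREA_POPULATIONS = {
    "A": {("<50", "F"): 1000, ("<50", "M"): 1000, ("50+", "F"): 500, ("50+", "M"): 500},
    "B": {("<50", "F"): 200, ("<50", "M"): 300, ("50+", "F"): 1000, ("50+", "M"): 1500},
}

# Hypothetical observed case counts per area.
OBSERVED = {"A": 9, "B": 10}


def expected_count(rates, population):
    """Expected cases if the area experienced the reference rates in every stratum."""
    return sum(rates[stratum] * n for stratum, n in population.items())


def log_offsets(rates, populations):
    """Log expected count per area, the quantity used as the offset in a Poisson model."""
    return {area: math.log(expected_count(rates, pop)) for area, pop in populations.items()}


expected = {area: expected_count(REFERENCE_RATES, pop) for area, pop in AREA_POPULATIONS.items()}
offsets = log_offsets(REFERENCE_RATES, AREA_POPULATIONS)
# Standardised ratio: observed cases divided by expected cases.
ratios = {area: OBSERVED[area] / expected[area] for area in OBSERVED}

for area in sorted(expected):
    print(area, round(expected[area], 2), round(offsets[area], 3), round(ratios[area], 3))
```

Because age and sex enter only through the expected count, the model has no coefficient for either of them, which is why their effects cannot be estimated under this approach.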