Abstract
Limited and missing socioeconomic data have made it nearly impossible to measure or estimate inequality consistently at fine spatiotemporal and jurisdictional scales, especially for lower- and middle-income countries (L&MICs). We deploy a novel data harmonization method that combines existing household survey data with freely available remotely sensed data and machine learning techniques to generate fine-scale measures of socioeconomic inequality across space and time for India. Our manuscript makes three contributions. First, it identifies key remote sensing datasets that, combined with nighttime luminosity, improve the power of luminosity to predict measures of socioeconomic inequality. Second, it offers an analytical approach that reliably estimates the uneven distribution of socioeconomic conditions by harmonizing household asset and socio-demographic information with the remotely sensed data that represent it at the village or a similar geographic level; the results achieve more than 84% prediction accuracy. Third, it leverages a spatially cross-validated machine learning model, with training and test datasets drawn from two successive Demographic & Health Surveys (DHS), to demonstrate how gaps in subnational socioeconomic inequality data can be addressed. Our replicable approach has the potential to improve global inequality data, thereby supporting research and applications that aim to reduce socioeconomic inequality in the context of the Sustainable Development Goals (SDGs).