Abstract This article presents a geospatial case study on electricity theft. The main objective is to identify the degree of correlation between exogenous variables and areas with a high density of irregular cases. First, a geospatial analysis is carried out to assess the null hypothesis of spatial randomness and check whether the data pattern presents clustering. For this, the Average Nearest Neighbor (ANN) method is applied, which rejected the null hypothesis for the data set. Once the clustering pattern is confirmed, a spatial weight matrix is created to study spatial autocorrelation by applying Global Moran’s I and Local Moran’s I. The Moran scatterplot is used to evaluate the degree of fit and to identify outliers and local pockets of stationarity. The Local Moran index is used to determine the location of the clusters and the relationship between the points. In the data pre-processing step, the exogenous variables are spatially interpolated using inverse distance weighting (IDW) to better associate consumer unit points with socioeconomic variables. After model tuning by feature selection, the spatial lag model reached an R-squared value of 87%, indicating that the model fit the observed data well.
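As an illustration of two of the building blocks named in the abstract, the following is a minimal sketch of IDW interpolation and Global Moran's I using only NumPy. It is not the implementation used in the study: the function names `idw_interpolate` and `global_morans_i`, the power parameter of 2, the binary neighbor weights, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def idw_interpolate(known_xy, known_values, query_xy, power=2.0):
    """Inverse distance weighting: estimate values at query points.

    Hypothetical helper; power=2.0 is an assumed default, not a value
    taken from the article.
    """
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)

    # Pairwise Euclidean distances, shape (n_query, n_known).
    dist = np.linalg.norm(query_xy[:, None, :] - known_xy[None, :, :], axis=2)

    estimates = np.empty(len(query_xy))
    for i, row in enumerate(dist):
        exact = row == 0
        if exact.any():
            # Query coincides with a sample point: return its value directly.
            estimates[i] = known_values[exact][0]
        else:
            w = 1.0 / row ** power
            estimates[i] = (w * known_values).sum() / w.sum()
    return estimates


def global_morans_i(values, weights):
    """Global Moran's I for values x and a spatial weight matrix W.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i^2,
    where z = x - mean(x) and S0 is the sum of all weights.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = values.size
    z = values - values.mean()
    s0 = weights.sum()
    numerator = (weights * np.outer(z, z)).sum()
    denominator = (z ** 2).sum()
    return (n / s0) * numerator / denominator


# Toy example: four locations on a line, each a neighbor of the adjacent ones.
chain_weights = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

clustered = global_morans_i([1, 1, 5, 5], chain_weights)    # similar values adjacent
alternating = global_morans_i([1, 5, 1, 5], chain_weights)  # dissimilar values adjacent

# Interpolate between two sample points with values 0 and 10.
interpolated = idw_interpolate([(0, 0), (2, 0)], [0, 10], [(1, 0), (0, 0)])
```

In this toy setting, similar values sitting next to each other give a positive index and alternating values give a negative one, which is the kind of signal a clustering study relies on. A real analysis would typically use a dedicated spatial statistics library and a significance test rather than the raw index alone.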