Abstract
Along with rapid urbanization, the growth and persistence of slums is a global challenge. While remote sensing imagery is increasingly used for producing slum maps, only a few studies have analyzed their temporal dynamics. This study explores the potential of fully convolutional networks (FCNs) to analyze the temporal dynamics of small clusters of temporary slums using very high resolution (VHR) imagery in Bangalore, India. The study develops two approaches based on FCNs. The first approach uses a post-classification change detection, and the second trains FCNs to directly classify the dynamics of slums. For both approaches, the performances of 3 × 3 kernels and 5 × 5 kernels of the networks were compared. While classification results of individual years exhibit a relatively high F1-score (3 × 3 kernel) of 88.4% on average, the change accuracies are lower. The post-classification results obtained an F1-score of 53.8% and the change-detection networks obtained an F1-score of 53.7%. According to the trajectory error matrix (TEM), the post-classification results scored higher for the overall accuracy but lower for the accuracy difference of change trajectories than the change-detection networks. Although the two methods did not have significant differences in terms of accuracy, the change-detection network was less noisy. Within our study area, the areas of slums show a small overall decrease; the annual growth of slums (between 2012 and 2016) was 7173 m², in contrast to an annual decline of 8390 m². However, these numbers hid the spatial dynamics, which were much larger. Interestingly, areas where slums disappeared commonly changed into green areas, not into built-up areas. The proposed change-detection network provides a robust map of the locations of changes with lower confidence about the exact boundaries. This shows the potential of FCNs for detecting the dynamics of slums in VHR imagery.
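The post-classification idea described in the abstract can be illustrated with a minimal sketch: two binary slum maps from different years are compared pixel by pixel, and the growth and decline areas are derived from the pixel counts. This is not the authors' implementation. The function names, the change codes, the 0.5 m pixel size, and the toy masks are all assumptions made for illustration; in the study the masks would come from the FCN classifications.

```python
import numpy as np

# Change codes for the comparison (illustrative labels, not from the paper).
STABLE_NON_SLUM, SLUM_GROWTH, SLUM_DECLINE, STABLE_SLUM = 0, 1, 2, 3


def post_classification_change(mask_t1, mask_t2):
    """Compare two binary slum masks (1 = slum) pixel by pixel."""
    t1 = np.asarray(mask_t1, dtype=bool)
    t2 = np.asarray(mask_t2, dtype=bool)
    change = np.full(t1.shape, STABLE_NON_SLUM, dtype=np.uint8)
    change[~t1 & t2] = SLUM_GROWTH   # non-slum at t1, slum at t2
    change[t1 & ~t2] = SLUM_DECLINE  # slum at t1, non-slum at t2
    change[t1 & t2] = STABLE_SLUM    # slum in both years
    return change


def change_areas_m2(change, pixel_size_m):
    """Return (growth, decline) areas in square metres."""
    pixel_area = pixel_size_m ** 2
    growth = np.count_nonzero(change == SLUM_GROWTH) * pixel_area
    decline = np.count_nonzero(change == SLUM_DECLINE) * pixel_area
    return growth, decline


def f1_score(predicted, reference):
    """F1-score of a binary prediction against a binary reference."""
    p = np.asarray(predicted, dtype=bool)
    r = np.asarray(reference, dtype=bool)
    tp = np.count_nonzero(p & r)
    fp = np.count_nonzero(p & ~r)
    fn = np.count_nonzero(~p & r)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 3 x 3 masks and a hypothetical 0.5 m pixel size, not data from the study.
mask_2012 = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]])
mask_2016 = np.array([[0, 1, 1], [0, 0, 1], [0, 0, 0]])
change_map = post_classification_change(mask_2012, mask_2016)
growth_m2, decline_m2 = change_areas_m2(change_map, pixel_size_m=0.5)
```

Applied to the figures reported in the abstract, the net annual change is 7173 − 8390 = −1217 m², which is small compared with the gross growth and decline. That gap is why the net figure hides most of the spatial dynamics.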
Highlights
More than half of the world’s population resides in urban settlements, with an expected increase to 68% by 2050 [1]
This study aims to explore the potential of fully convolutional networks (FCNs) to analyze the temporal dynamics of temporary slum areas based on very high resolution (VHR) imagery in Bangalore, India
Using an FCN architecture with dilated convolutions, we found that a network with 3 × 3 kernels achieved slightly higher accuracy (88.38%) than a network with 5 × 5 kernels (86.32%)
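To make the kernel comparison concrete, the sketch below computes how much spatial context a 3 × 3 and a 5 × 5 kernel cover once dilation is applied. It uses the standard relation for a dilated convolution, effective size = k + (k − 1)(d − 1). The dilation rates are hypothetical, since this summary does not state the ones used in the study, and the function names are illustrative.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolution layers."""
    field = 1
    for dilation in dilations:
        # Each layer widens the field by (k - 1) * dilation pixels.
        field += (kernel_size - 1) * dilation
    return field


# Hypothetical dilation schedule, not taken from the paper.
hypothetical_dilations = [1, 2, 4]
rf_3x3 = receptive_field(3, hypothetical_dilations)
rf_5x5 = receptive_field(5, hypothetical_dilations)
```

Under this assumed schedule the 5 × 5 network sees a wider context per layer at the cost of more parameters, so the 3 × 3 network's slightly higher reported accuracy suggests the extra context was not needed here.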
Summary
More than half of the world’s population resides in urban settlements, a share expected to rise to 68% by 2050 [1]. Cities’ limited capacity to meet this sharply increasing housing demand, combined with their inability to provide basic services, drives the growth and persistence of slums [2]. Image-based conceptualizations of slums often refer to building characteristics, such as roof materials, shape, and density [6]. Because such physical characteristics are visible in remote sensing imagery, they can be used to detect, map, and monitor slums. The resulting maps provide more consistent and more readily updated slum information than a national census, whose data are often very uncertain, quickly outdated, and usually cover only parts of the slums [7].