Abstract

Urban land use at the level of individual buildings is crucial geo-information for many applications, yet it is challenging to obtain. Street-level images are well suited to predicting building functions because building façades provide clear hints. Social media image platforms contain billions of images, but many of them do not show street perspectives. To cope with this issue, this study proposes a filtering pipeline that yields high-quality, ground-level imagery from large-scale social media image datasets. The pipeline ensures that all resulting images have complete and valid geotags with a compass direction, so that image content can be related to spatial objects. We analyze our method on a culturally diverse social media dataset from Flickr with more than 28 million images from 42 cities worldwide. The obtained dataset is then evaluated in the context of a building function classification task with three classes: commercial, residential, and other. Fine-tuned state-of-the-art architectures yield F1 scores of up to 0.51 on the filtered images. Our analysis shows that the quality of the labels from OpenStreetMap limits the performance: human-validated labels increase the F1 score by 0.2. Therefore, we consider the OpenStreetMap labels weak and publish the resulting images from our pipeline, together with the depicted buildings, as a weakly labeled dataset.
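
The geotag requirement described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the field names `lat`, `lon`, and `heading` are hypothetical, and the validity rule (finite values within the usual coordinate and compass ranges) is an assumption about what a complete and valid geotag could mean.

```python
import math
from typing import Any, Dict, Iterable, List


def _is_finite_number(value: Any) -> bool:
    """True for finite ints/floats; rejects None, strings, bools, NaN, inf."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
    )


def has_valid_geotag(meta: Dict[str, Any]) -> bool:
    """Check that a metadata record has position and compass direction.

    Assumed (hypothetical) keys: 'lat', 'lon' in decimal degrees and
    'heading' in degrees clockwise from north.
    """
    lat, lon, heading = meta.get("lat"), meta.get("lon"), meta.get("heading")
    if not all(_is_finite_number(v) for v in (lat, lon, heading)):
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180 and 0 <= heading < 360


def filter_geotagged(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only records whose geotag passes the check above."""
    return [r for r in records if has_valid_geotag(r)]


if __name__ == "__main__":
    sample = [
        {"id": "a", "lat": 48.14, "lon": 11.58, "heading": 270.0},
        {"id": "b", "lat": 48.14, "lon": 11.58},  # no compass direction
        {"id": "c", "lat": 95.0, "lon": 11.58, "heading": 10.0},  # bad latitude
    ]
    print([r["id"] for r in filter_geotagged(sample)])
```

A check of this kind would only be one step among several; separating ground-level street views from other image content requires further filtering that this sketch does not attempt.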
