Abstract

Information describing the elements of the urban landscape is a required input for studying numerous physical processes (e.g. climate, noise, air pollution). However, the accessibility and quality of urban data are heterogeneous across the world. For example, a major open-source geographical data project (OpenStreetMap) has incomplete data for key urban properties such as building height. The present study implements and evaluates a statistical approach that models the missing building height values in OpenStreetMap. A Random Forest method is applied to estimate building height from each building's immediate surroundings. 62 geographical indicators are calculated with the GeoClimate tool and used as independent variables. A training data set of 14 French communes is selected, and the reference building height is provided by the BDTopo IGN. An optimized Random Forest algorithm is proposed, and its outputs are compared with an evaluation data set. At building scale, for all cities, at least 50 % of the buildings have their height estimated with an error of less than 4 m (the city median building height ranges from 4.5 m to 18 m). Two communes (Paris and Meudon) yield building height results that fall outside the main trend because of their specific urban fabric. Setting these two communes aside, and when building height is averaged over a regular grid (100 m × 100 m), the median absolute error is 1.6 m and at least 75 % of the cells of any city have an error lower than 3.2 m. This order of magnitude is reasonable when compared to the accuracy of the reference data (at least 50 % of the buildings have a height uncertainty of 5 m). This work offers insights into the estimation of missing urban data using statistical methods and contributes to the use of open data sets processed with open-source software.
The software used to produce the data is freely available at https://zenodo.org/record/6372337 and the data set can be freely accessed at https://zenodo.org/record/6396361.
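
The error figures quoted above (a median absolute error and the error bound met by 75 % of items) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample heights below are hypothetical, and the quartile is computed with the inclusive method of Python's standard library, which may differ from the convention used in the study.

```python
from statistics import median, quantiles


def absolute_errors(estimated, reference):
    """Return the absolute height error (in metres) for each item."""
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    return [abs(e - r) for e, r in zip(estimated, reference)]


def error_summary(estimated, reference):
    """Summarise errors with the median and the 75th percentile."""
    errors = absolute_errors(estimated, reference)
    # quantiles(..., n=4) returns the three quartile cut points.
    _, _, third_quartile = quantiles(errors, n=4, method="inclusive")
    return {
        "median_abs_error": median(errors),
        "p75_abs_error": third_quartile,
    }


# Hypothetical heights in metres, invented for illustration only.
reference = [6.0, 9.0, 12.0, 15.0, 4.5]
estimated = [7.0, 8.0, 15.0, 15.5, 6.5]

summary = error_summary(estimated, reference)
print(summary)
```

The same summary applies at either scale described in the abstract: pass per-building heights for the building-scale figures, or per-cell averaged heights for the 100 m × 100 m grid figures.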
