Earth observation satellites offer vast opportunities for quantifying landscapes and regional land cover composition and change. Integrating artificial intelligence into remote sensing is essential for monitoring important land cover types such as forests, but developing and validating AI models requires a substantial volume of labeled data. The Wald5Dplus project introduces an open benchmark dataset for mid-European forests in which Sentinel-1/2 time series are labeled using airborne laser scanning and multispectral imagery. The freely accessible satellite images are fused in the polarimetric, spectral, and temporal domains, yielding analysis-ready data cubes with 512 channels per year on a 10 m UTM grid. The labels include tree count, crown area, tree type (deciduous, coniferous, dead), mean crown volume, base height, tree height, and forested area proportion per pixel. They are derived from an individual tree characterization of high-resolution airborne LiDAR data using a specialized segmentation algorithm. The dataset covers three test sites (Bavarian Forest National Park, Steigerwald, and Kranzberg Forest), encompasses around six million trees, and provides over two million labeled samples. Validation with random forest regression, reported as mean absolute error, median deviation, and standard deviation, confirms the high quality of the dataset, which is made freely available.
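The three validation metrics named in the abstract can be illustrated with a minimal sketch. It is not the project's validation code: the function name `error_summary`, the sample values, and the reading of "median deviation" as the median of the signed residuals (prediction minus reference) are all assumptions made for illustration. The random forest regression itself is not reproduced; the sketch only shows how the error summaries could be computed from reference labels and model predictions.

```python
import numpy as np


def error_summary(reference, predicted):
    """Summarise regression errors between reference labels and predictions.

    Hypothetical helper: "median deviation" is assumed here to mean the
    median of the signed residuals, and "standard deviation" the population
    standard deviation of those residuals.
    """
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if reference.shape != predicted.shape:
        raise ValueError("reference and predicted must have the same shape")

    residuals = predicted - reference
    return {
        "mean_absolute_error": float(np.mean(np.abs(residuals))),
        "median_deviation": float(np.median(residuals)),
        "standard_deviation": float(np.std(residuals)),
    }


# Made-up per-pixel tree heights in metres, for illustration only.
reference_height = [20.0, 25.0, 30.0, 18.0]
predicted_height = [21.0, 24.0, 33.0, 18.0]

summary = error_summary(reference_height, predicted_height)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting the median of the signed residuals alongside the mean absolute error separates systematic bias from overall error magnitude, which is one plausible reason to list both.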