Abstract

Semantic interpretation of multi-modal datasets is of great importance in many domains of geospatial data analysis. However, training models for automated semantic segmentation requires labeled training data, and in the case of multi-modality, for each representation form of the scene. To completely avoid the time-consuming and cost-intensive involvement of an expert in the annotation procedure, we propose an Active Learning (AL) pipeline in which a Random Forest classifier selects a subset of points sufficient for training and the necessary labels are obtained from the crowd. Within this AL loop, we aim at coupled semantic segmentation of an Airborne Laser Scanning (ALS) point cloud and the corresponding 3D textured mesh generated from LiDAR data and imagery in a hybrid manner. We pursue two main objectives: i) we evaluate the performance of the AL pipeline applied to an ultra-high resolution ALS point cloud and a derived textured mesh (both benchmark datasets are available at https://ifpwww.ifp.uni-stuttgart.de/benchmark/hessigheim/default.aspx); ii) we investigate the capabilities of the crowd regarding the interpretation of 3D geodata and observe that the crowd performs about 3 percentage points better when labeling meshes compared to point clouds. We additionally demonstrate that labels received solely from the crowd can power a machine learning system whose Overall Accuracy differs by less than 2 percentage points for the point cloud and less than 3 percentage points for the mesh, compared to using the completely labeled training pool. For deriving this sparse training set, we ask the crowd to label 0.25 % of the available training points, resulting in costs of $190.
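
To make the pipeline described above concrete, the following Python sketch shows one plausible pool-based AL loop: a Random Forest is retrained on a growing labeled subset, and the most uncertain points are sent to an oracle for labeling. This is a minimal illustration of ours, not the authors' implementation; the entropy-based query score, the batch size, the number of trees, and the programmatic oracle_label stand-in (in the paper, labels come from paid crowdworkers) are all assumptions.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def active_learning_loop(pool_features, oracle_label, seed_idx,
                             iterations=10, batch_size=100):
        # Pool-based AL sketch: retrain a Random Forest on the growing
        # labeled subset and query the most uncertain points from the
        # oracle (here a callable; in the paper, the crowd).
        labeled = list(seed_idx)
        labels = {i: oracle_label(i) for i in labeled}
        for _ in range(iterations):
            clf = RandomForestClassifier(n_estimators=100)
            clf.fit(pool_features[labeled],
                    np.array([labels[i] for i in labeled]))
            # Entropy of the predicted class distribution as uncertainty score
            proba = clf.predict_proba(pool_features)
            entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
            entropy[labeled] = -np.inf   # exclude already-labeled points
            for i in np.argsort(entropy)[-batch_size:]:
                labels[int(i)] = oracle_label(int(i))
                labeled.append(int(i))
        return clf, labeled

In an actual campaign, oracle_label would correspond to posting the selected points (or mesh faces) as paid microtasks to a crowdsourcing platform rather than calling a local function.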

Highlights

  • In recent years, significant effort was put into developing and advancing automatic Machine Learning (ML) methods such as Convolutional Neural Networks (CNNs) for various data representations such as 2D imagery (Ronneberger et al., 2015; Badrinarayanan et al., 2017) or 3D point clouds (Qi et al., 2017; Graham et al., 2018).

  • We aim to evaluate whether we can ease the interpretation of sampled points by further applying the method proposed in Kölle et al. (2021), denoted as Reducing Interpretation Uncertainty (RIU).

  • We first discuss the conducted experiments relying on real crowdworkers and give some details regarding the crowd campaigns, which ran in parallel to our Active Learning (AL) loops.

Introduction

Significant effort was put into developing and advancing automatic Machine Learning (ML) methods such as Convolutional Neural Networks (CNNs) for various data representations such as 2D imagery (Ronneberger et al., 2015; Badrinarayanan et al., 2017) or 3D point clouds (Qi et al., 2017; Graham et al., 2018). Tremendous effort was also invested in establishing massive annotated data corpora such as ImageNet (Deng et al., 2009). Since manual annotation of about 14 million images by experts is infeasible, this dataset was mainly built up by the available workforce of individual crowdworkers on the internet. Compared to annotating images of everyday scenes, the interpretation of geospatial data by non-experts (i.e., the crowd) is far more demanding due to the unfamiliar perspective (i.e., a nadir-like bird's-eye view). This complexity is further intensified when focusing on 3D data, which non-experts may never have dealt with before. When a semantic segmentation of 3D data is desired, working directly with the original data is most reasonable in order to avoid the loss of information caused, for instance, by projection to a lower-dimensional space. Crowd-based interpretation of such geospatial data has already been investigated by Herfort et al. (2018) and Walter and Soergel (2018).
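
The information loss caused by projection can be made tangible with a small synthetic sketch of ours (not from the paper): it rasterizes a random XYZ point cloud into 2D grid cells and counts how many points become indistinguishable after the 3D-to-2D projection. The point count, extent, and cell size are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    points = rng.uniform(0, 100, size=(10_000, 3))   # synthetic XYZ point cloud

    cell = 1.0                                        # raster cell size in metres
    ij = np.floor(points[:, :2] / cell).astype(int)   # project to 2D grid indices
    _, counts = np.unique(ij, axis=0, return_counts=True)

    # Every point beyond the first in a cell is no longer separable in 2D
    lost = points.shape[0] - counts.size
    print(f"{lost} of {points.shape[0]} points collapse onto an occupied cell")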
