Abstract

A single-task artificial neural network can already learn its model very well, so the benefit of transferring knowledge is limited. Moreover, as the number of tasks increases (e.g., semantic segmentation, panoptic segmentation, monocular depth estimation, and 3D point clouds), duplicate information may exist across tasks, and the improvement becomes less significant. Multi-task learning has emerged as a solution to these knowledge-transfer issues; it is an approach to scene understanding that involves multiple related tasks, each with potentially limited training data. Multi-task learning improves generalization by leveraging the domain-specific information contained in the training data of related tasks. In urban management applications such as infrastructure development, traffic monitoring, smart 3D cities, and change detection, automated multi-task data analysis for scene understanding based on semantic, instance, and panoptic annotation, as well as monocular depth estimation, is required to generate precise urban models. In this study, a common framework for the performance assessment of multi-task learning methods applied to fixed-wing UAV images for 2D/3D city modelling is presented.
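The abstract's claim that multi-task learning improves generalization by sharing domain information across related tasks is most often realized through hard parameter sharing: one shared encoder feeding several task-specific heads. The following minimal NumPy sketch illustrates that structure for two of the tasks mentioned (segmentation and depth); all layer sizes, names, and the use of a single linear-ReLU encoder are illustrative assumptions, not the architecture evaluated in this study.

```python
# Minimal sketch of hard parameter sharing in multi-task learning:
# a shared encoder produces features consumed by task-specific heads
# (here: semantic-segmentation logits and per-pixel depth regression).
# All dimensions are hypothetical choices for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    return x @ w + b

# Shared encoder weights (in real training, learned jointly from all tasks).
w_shared = rng.standard_normal((16, 8))
b_shared = np.zeros(8)

# Task-specific heads: segmentation over 3 classes, scalar depth per sample.
w_seg, b_seg = rng.standard_normal((8, 3)), np.zeros(3)
w_depth, b_depth = rng.standard_normal((8, 1)), np.zeros(1)

def forward(x):
    h = np.maximum(0.0, linear(x, w_shared, b_shared))  # shared ReLU features
    seg_logits = linear(h, w_seg, b_seg)                # head 1: segmentation
    depth = linear(h, w_depth, b_depth)                 # head 2: depth
    return seg_logits, depth

x = rng.standard_normal((4, 16))   # a batch of 4 input feature vectors
seg, depth = forward(x)
print(seg.shape, depth.shape)      # (4, 3) (4, 1)
```

Because both heads backpropagate through the same encoder during training, each task acts as an inductive bias for the others, which is the mechanism behind the generalization benefit described above.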

Highlights

  • In recent years, the role of traditional methods such as terrestrial mapping and conventional aerial photogrammetry has diminished because of their high cost and the long time they require to generate a multi-task dataset for scene understanding (Crawshaw, 2020; Khoshboresh Masouleh and Shah-Hosseini, 2020; Masouleh and Sadeghian, 2019; Ruder, 2017; Zhang and Yang, 2018)

  • Although an Unmanned Aerial Vehicle (UAV) with a high-resolution digital camera is an efficient tool for data generation, there is still a lack of multi-task datasets for scene understanding (Khoshboresh Masouleh and Shah-Hosseini, 2019)

  • We focus on multi-task learning based on semantic segmentation, building panoptic segmentation, and monocular depth estimation


Introduction

1.1 Motivation

In recent years, the role of traditional methods such as terrestrial mapping and conventional aerial photogrammetry has diminished because of their high cost and the long time they require to generate a multi-task dataset for scene understanding (Crawshaw, 2020; Khoshboresh Masouleh and Shah-Hosseini, 2020; Masouleh and Sadeghian, 2019; Ruder, 2017; Zhang and Yang, 2018). An affordable and accurate way to generate multi-task data is to combine an Unmanned Aerial Vehicle (UAV) carrying a high-resolution digital camera (e.g., RGB, multi-spectral, thermal, or hyperspectral) with machine learning methods (Bayanlou and Khoshboresh-Masouleh, 2020; Khoshboresh-Masouleh and Hasanlou, 2020). Although a UAV with a high-resolution digital camera is an efficient tool for data generation, there is still a lack of multi-task datasets for scene understanding (Khoshboresh Masouleh and Shah-Hosseini, 2019). A major yet unsolved research topic for accurate 2D/3D city model generation is multi-task learning for scene understanding from high-resolution, low-cost photogrammetry and remote sensing data sources (Khoshboresh Masouleh and Saradjian, 2019). We believe all of these are important datasets for urban scene analysis, but our proposed dataset comprises much larger multi-task data with greater scene complexity in terms of the number of objects.
