Abstract

This letter proposes a novel method to obtain panoptic predictions by extending the semantic segmentation task with a few non-learning image processing steps, which presents the following benefits: (1) annotations do not require a specific format (e.g., COCO); (2) fewer parameters (e.g., a single loss function and no object detection parameters); and (3) a more straightforward sliding-window implementation for classifying large images (still unexplored for panoptic segmentation). Semantic segmentation models do not individualize touching objects, since their predictions can merge, i.e., a single polygon may represent many targets. Our method overcomes this problem by isolating the objects using borders on the polygons that may merge. Data preparation requires generating a 1-pixel border between touching objects; for unique object identification, we list the isolated polygons, attribute a different value to each one, and apply the expanding border (EB) algorithm to those with borders. Although any semantic segmentation model applies, we used a U-Net with three backbones (EfficientNet-B5, EfficientNet-B3, and EfficientNet-B0). The results show that (1) EfficientNet-B5 achieved the best results, with 70% mIoU; (2) the EB algorithm produced larger gains for the stronger models; (3) the panoptic metrics show a high capability of identifying things and stuff, with a Panoptic Quality (PQ) of 65; and (4) the sliding window on a 2560x2560-pixel area showed promising results, with the ratio of merged objects to correct predictions below 1% for all classes.
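Since the EB algorithm is only summarized above, a minimal sketch of the described steps (label each isolated polygon with a unique value, then expand the labels back over the removed 1-pixel border) might look like the following. The function name `expand_borders`, the use of `scipy.ndimage`, and the tie-breaking rule for touching objects are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage


def expand_borders(pred_mask: np.ndarray, border_width: int = 1):
    """Recover individual objects from a semantic prediction whose
    touching objects are separated by a thin background border.

    pred_mask    : 2-D boolean array, True where the model predicted
                   the target class (borders appear as False gaps).
    border_width : number of border pixels removed at annotation time
                   (1 in the letter's data preparation).
    """
    # Step 1: label each isolated polygon with a unique integer id.
    labels, num_objects = ndimage.label(pred_mask)

    # Step 2: grow every label outward, one dilation per border pixel,
    # to reclaim the border that was erased during annotation.
    for _ in range(border_width):
        grown = ndimage.grey_dilation(labels, size=(3, 3))
        # Only fill still-unlabeled pixels, so existing objects are
        # never overwritten; where two objects compete for the same
        # border pixel, the larger id wins (an assumed tie-break).
        labels = np.where(labels == 0, grown, labels)

    # Simplification: this also grows objects one pixel into true
    # background; restricting growth to known border pixels would
    # require the original (un-bordered) foreground mask.
    return labels, num_objects


# Toy usage: two objects separated by a 1-pixel border column.
mask = np.array([[1, 1, 0, 1, 1],
                 [1, 1, 0, 1, 1]], dtype=bool)
instances, n = expand_borders(mask)
# n == 2; after one expansion step the border column is reassigned
# to one of the neighbouring objects.
```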
