Abstract

This letter proposes a novel method to obtain panoptic predictions by extending the semantic segmentation task with a few non-learning image processing steps, offering the following benefits: (1) annotations do not require a specific format (e.g., COCO); (2) fewer parameters (e.g., a single loss function and no object detection parameters); and (3) a more straightforward sliding-window implementation for classifying large images (still unexplored for panoptic segmentation). Semantic segmentation models do not individualize touching objects, as their predictions can merge, i.e., a single polygon may represent many targets. Our method overcomes this problem by isolating the objects using borders on the polygons that could merge. Data preparation requires generating a 1-pixel border between touching objects; for unique object identification, we create a list of the isolated polygons, attribute a different value to each one, and apply the expanding border (EB) algorithm to those with borders. Although any semantic segmentation model applies, we used the U-Net with three backbones (EfficientNet-B5, EfficientNet-B3, and EfficientNet-B0). The results show that (1) EfficientNet-B5 obtained the best results, with 70% mIoU; (2) the EB algorithm performed better for the stronger models; (3) the panoptic metrics show a high capability of identifying things and stuff, with a Panoptic Quality (PQ) of 65; and (4) the sliding windows on a 2560 × 2560-pixel area showed promising results, in which the ratio of merged objects to correct predictions was lower than 1% for all classes.
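The post-processing pipeline summarized above (remove the predicted 1-pixel borders, label each isolated polygon with a unique value, then grow the labels back into the border pixels) can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: the function name `panoptic_from_semantic` is hypothetical, and the nearest-label assignment used for the expansion step is only one plausible realization of the EB algorithm, which the abstract does not fully specify.

```python
import numpy as np
from scipy import ndimage


def panoptic_from_semantic(class_mask, border_mask):
    """Illustrative sketch of the border-based instance separation.

    class_mask  : 2-D int array, semantic class id per pixel (0 = background).
    border_mask : 2-D bool array, pixels predicted as the 1-pixel border
                  separating touching objects.
    Returns a 2-D int array with a unique instance id per object.
    """
    # 1. Remove the predicted borders so touching objects become
    #    disconnected polygons.
    foreground = (class_mask > 0) & ~border_mask

    # 2. Attribute a different value to each isolated polygon
    #    (connected-component labeling).
    instances, _ = ndimage.label(foreground)

    # 3. Expansion step (one plausible reading of EB): assign every
    #    unlabeled pixel the id of its nearest labeled pixel, which
    #    grows the polygons back into the removed border pixels.
    _, (rows, cols) = ndimage.distance_transform_edt(
        instances == 0, return_indices=True)
    expanded = instances[rows, cols]

    # Keep true background unlabeled.
    expanded[class_mask == 0] = 0
    return expanded
```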
