Image Object Extraction Based on Semantic Segmentation and Label Loss

Xiaoru Wang, Peirong Xu, Fu Li, Zhihong Yu

doi:10.1109/access.2020.2999942

Abstract

Object extraction is the operation of obtaining an object area from an image based on a small amount of mark information given by users, and it is a key step in image processing. To obtain a complete object profile, current methods usually require a large number of manual annotations, especially for objects with irregular contours. Because traditional algorithms rely on low-level pixel features without semantic information and rest on explicit mathematical assumptions (i.e., a strong inductive bias), they have difficulty identifying objects completely. Semantic segmentation-based methods currently improve the integrity of object extraction by adding more pre-processing and post-processing steps, which increases complexity and latency. In this paper, we propose a novel model named IOEBSS, which includes a fast binary plane pre-processing step, an improved Deeplab v3+ semantic segmentation model, and an auxiliary loss function named Label Loss. The fast binary plane pre-processing accelerates the transformation of interactive inputs. The improved semantic segmentation model makes the extracted results more semantically complete, and Label Loss is more conducive to gradient flow and accelerates training convergence. For these reasons, IOEBSS can accurately and quickly identify objects with complex contours and colors. On the Pascal VOC and COCO datasets, IOEBSS shows a significant improvement over current methods in accuracy, inference speed, and convergence speed.

Highlights

Object extraction is a key operation in image processing
The object extraction task with interactive inputs is less complex than general semantic segmentation, whereas Xception-65 is a network with a large parameter space and is more likely to fall into local optima
The IOEBSS model proposed in this paper includes an efficient binary plane pre-processing step, a high-precision fine-tuned semantic segmentation model, and an auxiliary loss function that is more conducive to gradient flow

Summary

INTRODUCTION

Object extraction is a key operation in image processing. It determines which areas to keep and which to discard based on the user's interactive inputs, which contain a small amount of foreground and background information, and it enables subsequent image processing operations such as image fusion and shape and position editing. In scribble-based interaction, the user marks the foreground and background on the image with casual strokes, and the remaining area is left to be segmented. Existing semantic segmentation based algorithms have two disadvantages. To understand the interactive input more accurately and to carry the model over from semantic segmentation to object extraction, IOEBSS replaces the backbone network with the more versatile ResNet-101. This design enhances the ability to extract semantic information and significantly improves the accuracy of the model.
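As an illustration of how foreground and background marks can be handed to a segmentation network, the sketch below encodes each set of marked pixels as a binary plane and appends both planes to the RGB channels. It is only a minimal sketch under stated assumptions: this summary does not specify the paper's fast binary plane pre-processing, so the helper names `binary_plane` and `stack_input`, the (row, col) point convention, and the five-channel layout are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[int, int]   # (row, col) of a marked pixel; convention assumed here
Plane = List[List[int]]   # height x width grid of values


def binary_plane(height: int, width: int, points: Iterable[Point]) -> Plane:
    """Return a height x width plane with 1 at marked pixels and 0 elsewhere."""
    plane = [[0] * width for _ in range(height)]
    for row, col in points:
        # Ignore marks that fall outside the image bounds.
        if 0 <= row < height and 0 <= col < width:
            plane[row][col] = 1
    return plane


def stack_input(
    rgb: Sequence[Plane],
    fg_points: Iterable[Point],
    bg_points: Iterable[Point],
) -> List[Plane]:
    """Append a foreground plane and a background plane to three RGB planes.

    The result has five channels: R, G, B, foreground marks, background marks.
    This channel layout is an assumption made for illustration.
    """
    height, width = len(rgb[0]), len(rgb[0][0])
    return list(rgb) + [
        binary_plane(height, width, fg_points),
        binary_plane(height, width, bg_points),
    ]
```

For a 3 x 4 image, `stack_input(rgb, [(0, 1)], [(2, 3)])` yields five planes, with the fourth marking the foreground pixel and the fifth marking the background pixel.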

RELATED WORK

IMPROVED SEMANTIC SEGMENTATION MODEL

LABEL LOSS

EXPERIMENTS

Findings

CONCLUSIONS AND FUTURE WORK