Mining Semantic Correlations Between Mispredictions and Corrections for Interactive Semantic Segmentation.

Yutong Gao, Chuan-Sheng Foo, Congyan Lang, Yunchao Wei, Yuanzhouhan Cao, Lijuan Sun, Fayao Liu

doi:10.1109/tnnls.2024.3379585

Abstract

Interactive semantic segmentation aims to produce high-quality segmentation results from a small number of user clicks. It is attracting growing research attention because it simplifies the labeling of pixel-level semantic data. Existing interactive segmentation methods typically improve interaction efficiency by mining the latent information in user clicks or by exploring more efficient forms of interaction. However, these works do not explicitly exploit the semantic correlations between user corrections and model mispredictions, and they therefore suffer from two flaws. First, similar prediction errors recur frequently in actual use, forcing users to correct them repeatedly. Second, the interaction difficulty of each semantic class varies across images, yet existing models apply a single fixed set of parameters to all images and thus lack semantic pertinence. In this article, we explore the semantic correlations between corrections and mispredictions and propose a simple yet effective online learning solution to these problems, named correction-misprediction correlation mining (CM2). Specifically, we leverage correction-misprediction similarities to design a confusion memory module (CMM) that automatically corrects similar prediction errors when they reappear. Furthermore, we measure semantic interaction difficulty by counting correction-misprediction pairs and design a challenge adaptive convolutional layer (CACL), which adaptively switches between parameters according to interaction difficulty in order to better segment the challenging classes. Our method requires no extra training beyond the online learning process and effectively improves interaction efficiency. The proposed CM2 achieves state-of-the-art results on three public semantic segmentation benchmarks.

Full Text