Adjustable patch and feature prior token-based transformer for weakly supervised semantic segmentation

Linjuan Li, Haoxue Zhang, Gang Xie, Yanhong Bai

doi:10.1080/1206212x.2024.2333122

Abstract

Weakly supervised semantic segmentation is a challenging task that uses only low-cost weak supervision to produce pixel-level predictions. Existing transformer-based methods for weakly supervised semantic segmentation have two limitations: (1) a fixed patch size can destroy structured semantics, which is unfriendly to objects of different scales, and (2) ignoring prior features when using multi-head attention can lead to inaccurate segmentation localization. To tackle these issues, we propose an effective transformer framework coupled with adjustable patch and prior feature tokens, termed APFPformer. An adjustable patch module splits the image according to the area of the salient object, preserving the structured semantics within patches. A prior feature token module exploits edge and texture information as prior tokens to obtain a more discriminative representation. Additionally, a single-stage scheme is applied to reduce computation and speed up the segmentation process. Our experiments demonstrate the superiority of our approach over earlier methods, achieving competitive mean Intersection-over-Union scores of 67.9% on the PASCAL VOC2012 dataset and 40.2% on the MS COCO dataset.
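
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Intersection-over-Union can be computed from flattened per-pixel class labels. It is not the authors' evaluation code: the function name `mean_iou` and its arguments are illustrative assumptions, and benchmark protocols typically add details omitted here, such as accumulating counts over the whole dataset and skipping an ignore label.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target.

    Illustrative helper (hypothetical name), not the paper's evaluation script.
    pred and target are flattened per-pixel class indices in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes  # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labeled as each class

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.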

More From: International Journal of Computers and Applications

Similar Papers

Single-Shot Object Detection with Split and Combine Blocks
Hongwei Wang ... Zhaoyang Wang
Applied Sciences | VOL. 10
13 Sep 2020

Learning Richer Features in Deep CNN for Object Detection
Yi Li ... Xiaowei He
01 Oct 2020

A Position-Aware Transformer for Image Captioning
Zelin Deng ... Osama Alfarraj
Computers, Materials & Continua | VOL. 70
01 Jan 2021

Enhanced Feature Pyramid Networks by Feature Aggregation Module and Refinement Module
Xuan-Thuy Vo ... Kang-Hyun Jo
01 Jun 2020
