Supervised Object Localization Research Articles

Recently, deep learning-based underwater object detection has achieved remarkable success. However, that success depends on accurate and complete instance annotations in the dataset. Because underwater images are of low quality and objects are heavily clustered and occluded, acquiring object annotations demands substantial time and labor, while incorrect and missed annotations can degrade model performance and limit its application in practical scenarios. To address this issue, this paper presents a novel weakly supervised real-time underwater object detection method, which is divided into two subtasks: weakly supervised object localization and real-time object detection. For the weakly supervised object localization task, we design a novel category hierarchy structure network that integrates a high-resolution attention class activation mapping algorithm to obtain high-quality class activation maps, weaken background interference, and recover more complete object regions. A parameterized spatial loss module is devised to help the model escape local optima, so that pseudo-detection annotation boxes are obtained accurately and efficiently. For the real-time object detection task, the single-stage detector YOLOv7 is selected as the base detection model, and an object perception loss function is designed based on the class activation map to jointly supervise the training process. A method for filtering noisy pseudo-supervision information is proposed to improve the quality of the pseudo-supervision used in training. Ablation and multi-method comparison experiments were conducted on the URPC and RUOD datasets. The results verify the effectiveness of the proposed strategy, and our model exhibits significant advantages in detection performance and efficiency over current mainstream and advanced models.
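The abstract above describes turning class activation maps into pseudo-detection boxes. As a rough illustration only, the sketch below uses the generic class activation mapping recipe (a classifier-weighted sum of feature channels, followed by thresholding), not the paper's high-resolution attention variant or its parameterized spatial loss. The function names, the 0.5 threshold, and the toy feature tensor are all assumptions made for this example.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Generic CAM: weight each feature channel by the classifier weight
    for one class, sum over channels, and rescale to [0, 1].
    features: (C, H, W) array; class_weights: (C,) array."""
    cam = np.tensordot(class_weights, features, axes=([0], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def pseudo_box(cam, threshold=0.5):
    """Tightest box (x_min, y_min, x_max, y_max) around all cells whose
    activation is at least `threshold`; None if no cell qualifies.
    The threshold value is an assumption, not taken from the paper."""
    ys, xs = np.nonzero(cam >= threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy input (hypothetical): channel 0 responds to the object region,
# channel 1 responds to a single background cell.
features = np.zeros((2, 6, 6))
features[0, 2:5, 1:4] = 1.0
features[1, 0, 0] = 1.0
weights = np.array([1.0, 0.2])

cam = class_activation_map(features, weights)
box = pseudo_box(cam, threshold=0.5)
print(box)
```

In this toy case the weakly activated background cell falls below the threshold, so the box covers only the object region. A real pipeline would typically upsample the map to image resolution and keep the largest connected component before drawing the box.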

Weakly supervised object localization (WSOL), which trains object localization models using only image category annotations, remains a challenging problem. Existing approaches based on convolutional neural networks (CNNs) tend to activate discriminative object parts while missing the full object extent. Based on our analysis, this is caused by the intrinsic characteristics of CNNs, which have difficulty capturing object semantics at long distances. In this article, we introduce the vision transformer to WSOL, aiming to capture long-range semantic dependencies among features by leveraging the transformer's cascaded self-attention mechanism. We propose the token semantic coupled attention map (TS-CAM) method, which first decomposes class-aware semantics and then couples the semantics with attention maps for semantic-aware activation. To capture object semantics at long distances and avoid partial activation, TS-CAM performs spatial embedding by partitioning an image into a set of patch tokens. To incorporate object category information into patch tokens, TS-CAM reallocates category-related semantics to each patch token. The patch tokens are finally coupled with attention maps, which are semantic-agnostic, to perform semantic-aware object localization. By introducing semantic tokens to produce semantic-aware attention maps, we further explore the capability of TS-CAM for multicategory object localization. Experiments show that TS-CAM outperforms its CNN-CAM counterpart by 11.6% and 28.9% on the ILSVRC and CUB-200-2011 datasets, respectively, improving on the state of the art by large margins. TS-CAM also demonstrates superiority for multicategory object localization on the Pascal VOC dataset. The code is available at github.com/yuanyao366/ts-cam-extension.
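The coupling step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the released TS-CAM code: it assumes the semantic-agnostic attention map is the class token's attention to the patch tokens (heads already averaged) summed over layers, and that coupling is an elementwise product with per-class semantic maps. The function name, array shapes, and toy values are hypothetical.

```python
import numpy as np

def couple_attention_with_semantics(cls_attention, semantic_maps):
    """Combine a class-agnostic attention map with per-class semantic maps.
    cls_attention: (L, N) attention from the class token to N patch tokens,
                   one row per transformer layer, heads already averaged.
    semantic_maps: (K, H, W) per-class maps built from patch tokens, H*W == N.
    Returns (K, H, W) semantic-aware localization maps."""
    num_classes, h, w = semantic_maps.shape
    # Assumption: aggregate attention over layers by summation.
    attention = cls_attention.sum(axis=0).reshape(h, w)
    # Assumption: coupling is an elementwise product, broadcast over classes.
    return semantic_maps * attention[None, :, :]

# Toy input (hypothetical): 2 layers, a 2x2 patch grid, 2 classes.
cls_attention = np.array([[0.1, 0.4, 0.1, 0.4],
                          [0.1, 0.4, 0.1, 0.4]])
semantic_maps = np.array([[[1.0, 1.0], [0.0, 0.0]],   # class 0: top row
                          [[0.0, 0.0], [1.0, 1.0]]])  # class 1: bottom row

coupled = couple_attention_with_semantics(cls_attention, semantic_maps)
print(coupled)
```

The attention map alone cannot tell the two classes apart, since it highlights the right-hand column in both rows. After coupling, each class keeps only the attended cells that also carry its own semantics.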

Articles published on Supervised Object Localization

Weakly Supervised Underwater Object Real-time Detection Based on High-resolution Attention Class Activation Mapping and Category Hierarchy

Semantic-Constraint Matching for transformer-based weakly supervised object localization

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

PCSformer: Pair-wise Cross-scale Sub-prototypes mining with CNN-transformers for weakly supervised semantic segmentation

Clustering-inspired channel selection method for weakly supervised object localization

Generalized Weakly Supervised Object Localization.

Feature disparity learning for weakly supervised object localization

Anti-Adversarially Manipulated Attributions for Weakly Supervised Semantic Segmentation and Object Localization.

Weakly Supervised Object Localization with Background Suppression Erasing for Art Authentication and Copyright Protection

Multi-Layer Decoupling Attention Network for Weakly Supervised Object Localization

Re-perceive Global Vision of Transformer for Remote Sensing Images Weakly Supervised Object Localization

Localizing From Classification: Self-Directed Weakly Supervised Object Localization for Remote Sensing Images.

Exploring Intrinsic Discrimination and Consistency for Weakly Supervised Object Localization.

Learning Local Semantic Region Activations for Weakly Supervised Object Localization

Boosting Weakly Supervised Object Localization and Segmentation With Domain Adaption.

Adaptive Zone Learning for Weakly Supervised Object Localization.

Weakly supervised object localization via knowledge distillation based on foreground–background contrast

Background-Aware Classification Activation Map for Weakly Supervised Object Localization.

Module of Axis-based Nexus Attention for weakly supervised object localization

Integration of Multi-scale CAM and Attention for Weakly Supervised Defects Localization on Surface Defective Apple
