Dense Crosstalk Feature Aggregation for Classification and Localization in Object Detection

Yuanwei Li, En Zhu, Jiyong Tan, Li Shen, Hang Chen

doi:10.1109/tcsvt.2022.3218880

Abstract

Resolving the misalignment between classification and localization offers a significant opportunity to improve object detection performance. To cope with this misalignment, many approaches separate the tasks (e.g., classification and bounding box regression) by introducing extra heads, emphasizing task separation to accommodate their differences. In this paper, we argue that both separation and crosstalk between classification and localization are important. Because the two tasks differ and have different regions and features of interest, they conflict with each other and therefore need to be separated. However, they also need to be fused, because classification and localization are, after all, about understanding the same object. To realize this idea, we systematically introduce a bidirectional crosstalk detection head that provides full, deep cross-fusion between classification and localization. To the best of our knowledge, this is the first time that full bidirectional crosstalk has been introduced between classification and localization for one-stage detectors. Extensive experiments demonstrate the effectiveness of the proposed method. With a ResNet-50 backbone, our method significantly improves the GFLV1 baseline by 2.0 AP with similar inference speed (18.5 fps vs. 18.3 fps) and further boosts GFLV1 by a large margin (4.3 AP) when the model size is increased. Fair comparisons also show that the proposed head outperforms state-of-the-art heads (T-Head, DyHead) with comparable or faster inference speed under the same ATSS baseline model. With a Res2Net-DCN backbone, our model achieves 51.7 AP in single-model, single-scale testing. The code and pretrained models will be made publicly available.

Journal: IEEE transactions on circuits and systems for video technology : a publication of the Circuits and Systems Society
Publication Date: Jun 1, 2023
Citations: 1