Abstract

Recently, deformable convolutional networks have shown superior performance in object detection because of their ability to adapt to the geometric variations of objects. These methods learn the offset fields under the supervision of localization and recognition. Nevertheless, the spatial support of these networks may be inexact because the offsets are learned implicitly via an extra convolutional layer. In this work, we present curvature-driven deformable convolutional networks (C-DCNets), which use an explicit geometric property of the preceding feature maps to enhance the deformability of the convolution operation and make it easier for the networks to focus on pertinent image regions. To be consistent with the postprocessing used in object detection, we multiply the class prediction probability by the similarity between the predicted boxes and the ground-truth boxes, take the product as the final class prediction probability, and substitute it into the binary cross-entropy loss function. The resulting loss function correlates bounding box regression and classification. Experimental results on the PASCAL VOC and COCO data sets show that C-DCNets-based YOLOv4 with the proposed loss function outperforms state-of-the-art algorithms.

Highlights

  • Attention mechanisms make a neural network pay more attention to relevant parts of the image than to irrelevant parts. Therefore, they can model long-range dependencies

  • To further improve the deformation ability of deformable convolutional networks, we introduce the intrinsic geometric property of the input feature maps and propose curvature-driven deformable convolutional networks (C-DCNets), which use offset learning guided by the curvature fields of the preceding feature maps to focus the network on pertinent image regions. The proposed method produces leading results on the PASCAL VOC and COCO data sets for object detection

  • To strengthen the deformability of the convolution operation under an irregular sampling grid, we propose curvature-driven deformable convolutional networks (C-DCNets) based on an explicit geometric property of the preceding feature maps, where the curvature fields are used to guide offset learning, and the proposed C-DCNets modules are learned under the supervision of a loss function that correlates position accuracy and class prediction probability
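As a rough illustration of what a curvature field of a feature map could look like, the sketch below computes the mean curvature of a single-channel map treated as a height surface z = f(x, y), using central finite differences. The choice of mean curvature, the function name `mean_curvature`, and the zero-valued border are all assumptions made for this example; the text above does not state which curvature measure C-DCNets use or how the field is fed into offset learning.

```python
def mean_curvature(f):
    """Mean curvature of a 2D map f (list of rows), seen as the surface z = f(x, y).

    Assumption: this is one possible curvature measure, not necessarily the one
    used by C-DCNets. Border pixels are left at 0.0 because central differences
    need a full 3x3 neighborhood.
    """
    h, w = len(f), len(f[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # First derivatives (central differences, unit pixel spacing).
            fx = (f[i][j + 1] - f[i][j - 1]) / 2.0
            fy = (f[i + 1][j] - f[i - 1][j]) / 2.0
            # Second derivatives.
            fxx = f[i][j + 1] - 2.0 * f[i][j] + f[i][j - 1]
            fyy = f[i + 1][j] - 2.0 * f[i][j] + f[i - 1][j]
            fxy = (f[i + 1][j + 1] - f[i + 1][j - 1]
                   - f[i - 1][j + 1] + f[i - 1][j - 1]) / 4.0
            # Standard mean-curvature formula for a graph surface.
            num = (1.0 + fy * fy) * fxx - 2.0 * fx * fy * fxy + (1.0 + fx * fx) * fyy
            den = 2.0 * (1.0 + fx * fx + fy * fy) ** 1.5
            out[i][j] = num / den
    return out


# A planar map has no curvature; a bowl-shaped map is curved at its center.
plane = [[i + 2.0 * j for j in range(5)] for i in range(5)]
bowl = [[(i - 2.0) ** 2 + (j - 2.0) ** 2 for j in range(5)] for i in range(5)]
plane_curv = mean_curvature(plane)
bowl_curv = mean_curvature(bowl)
```

Flat or linearly varying regions give values near zero, while regions where the response bends sharply give large magnitudes, which is the kind of signal that could plausibly highlight salient structure for offset learning.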


Summary

Introduction

Attention mechanisms make a neural network pay more attention to relevant parts of the image than to irrelevant parts.
(1) Curvature-driven deformable convolutional networks (C-DCNets) are proposed, which make the spatial support of the networks adapt much better to salient regions.
(2) A new loss function that couples bounding box regression and classification is proposed, in which the class prediction probability in the binary cross-entropy loss function integrates the similarity between predicted boxes and targets.
(3) We evaluate C-DCNets-based detection frameworks with the proposed loss function on the PASCAL VOC and COCO data sets against very competitive Faster R-CNN [12], YOLOv4 [17], DETR [22], and deformable DETR [23] baselines.
The rest of this paper is organized as follows: In Section 2, the related works on attention mechanisms and postprocessing techniques are reviewed.
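The loss described in contribution (2) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the box similarity is plain IoU on `(x1, y1, x2, y2)` boxes, and the names `iou` and `coupled_bce` as well as the clamping constant `eps` are hypothetical choices for this example.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coupled_bce(class_probs, targets, pred_box, gt_box, eps=1e-7):
    """Binary cross entropy on class probabilities scaled by box similarity.

    Assumption: similarity is IoU; the source only says "similarity of
    predicted boxes and ground truth boxes".
    """
    sim = iou(pred_box, gt_box)
    total = 0.0
    for p, t in zip(class_probs, targets):
        # Final class probability = predicted probability * box similarity,
        # clamped so that log() stays finite.
        q = min(max(p * sim, eps), 1.0 - eps)
        total += -(t * math.log(q) + (1.0 - t) * math.log(1.0 - q))
    return total / len(class_probs)


gt = (0.0, 0.0, 2.0, 2.0)
well_localized = coupled_bce([0.9], [1.0], (0.0, 0.0, 2.0, 2.0), gt)
poorly_localized = coupled_bce([0.9], [1.0], (1.0, 1.0, 3.0, 3.0), gt)
```

With the same class score, the poorly localized box receives a larger loss for the positive class, so the classification term is penalized when the box regression is inaccurate. This mirrors postprocessing that ranks detections by a score combining class confidence and localization quality.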

