Abstract

Lower versions of EfficientDet (such as D0 and D1) have smaller network structures and fewer parameters, but lower detection accuracy. Higher versions are more accurate, but their increased network complexity poses challenges for real-time processing and raises hardware requirements. To achieve higher accuracy under limited computational resources, this paper introduces SpanEffiDet, which is based on a channel adaptive frequency filter (CAFF) and the Span-Path Bidirectional Feature Pyramid structure. First, the proposed CAFF module transforms channel information into the frequency domain through the Fourier transform and extracts key features through semantic adaptive frequency filtering, thereby eliminating redundant channel information from EfficientNet. The module also computes weights across channels at fine granularity and captures the detailed information of element-level features. Second, a bidirectional feature pyramid network, the multi-level cross-BiFPN, which supports multi-layer and multi-node connections, is proposed to build cross-level information transmission that incorporates both the semantic and the positional information of the target. This design enables the network to detect objects with significant size differences in complex environments more effectively. Finally, by introducing Generalized Focal Loss V2, reliable localization quality estimation scores are predicted from the distribution statistics of the bounding boxes, thereby improving localization accuracy. The experimental results indicate that, on the MS COCO dataset, SpanEffiDet-D0 achieves an AP improvement of 3.3% over the original EfficientDet series algorithms. Similarly, on the PASCAL VOC2007 and VOC2012 datasets, the mAP of SpanEffiDet-D0 is 1.66% and 2.65% higher, respectively, than that of EfficientDet-D0.
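
The abstract does not specify the internal form of CAFF, so the following is only a minimal sketch of the general idea of channel-wise frequency filtering, not the authors' implementation. It assumes each channel has already been reduced to one scalar descriptor (for example by global average pooling), applies a discrete Fourier transform along the channel axis, scales each frequency by a mask, and maps the result back to per-channel weights with a sigmoid. The names `dft` and `channel_frequency_filter` are hypothetical, and the fixed `mask` stands in for a filter that, in the paper, would be learned and input-dependent.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse when inverse=True)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            v * cmath.exp(sign * 2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        # The inverse transform carries the 1/n normalisation.
        out.append(acc / n if inverse else acc)
    return out


def channel_frequency_filter(channel_descriptors, mask):
    """Hypothetical sketch: filter channel descriptors in the frequency domain.

    channel_descriptors: one scalar per channel (assumed to come from pooling).
    mask: one gain per frequency bin; a fixed stand-in for a learned filter.
    Returns one weight in (0, 1) per channel.
    """
    if len(channel_descriptors) != len(mask):
        raise ValueError("mask must have one entry per channel")
    spectrum = dft(channel_descriptors)                  # channel axis -> frequency
    filtered = [s * m for s, m in zip(spectrum, mask)]   # per-frequency gain
    restored = dft(filtered, inverse=True)               # back to the channel axis
    # Keep the real part and squash it into a per-channel weight.
    return [1.0 / (1.0 + math.exp(-r.real)) for r in restored]


# Example: keep only the lowest frequency (the DC term) of four channels.
descriptors = [0.0, 1.0, -1.0, 2.0]
weights = channel_frequency_filter(descriptors, mask=[1.0, 0.0, 0.0, 0.0])
```

With an all-ones mask the filter is the identity and the weights reduce to a plain sigmoid of the descriptors. Keeping only the DC term gives every channel the same weight, derived from the mean descriptor. A practical module would sit between these two extremes and would typically use a fast Fourier transform from a tensor library instead of the naive transform shown here.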
