Abstract
The dual attention module is a powerful technique for semantic segmentation, but it incurs high computational cost and GPU memory usage. To address these challenges, we introduce an efficient dual perception network built from two modules: a streamlined Multi-scale Efficient Position Attention Module (MEPAM) and an optimized Multi-scale Efficient Channel Attention Module (MECAM). MEPAM incorporates multi-scale global average pooling into the Position Attention Module (PAM), substantially reducing computational overhead and memory consumption without compromising performance. MECAM integrates compressed convolutions into the Channel Attention Module (CAM), improving segmentation accuracy and inference speed over conventional methods such as DANet. Our approach was evaluated comprehensively on semantic segmentation benchmarks and shows superior performance; on the Cityscapes dataset, for example, it achieves an IoU of 82.2%. In terms of efficiency, MEPAM runs nearly 1.97 times faster than the standard PAM on GPU and requires 7.55 times less memory with a [Formula: see text] input, while MECAM is roughly 2.2 times faster than CAM and reduces GPU memory usage by 7.53 times. The proposed dual perception network thus improves both segmentation accuracy and speed while addressing the computational burden of traditional dual attention modules.
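The abstract only summarizes the design, but the core idea behind MEPAM, shrinking the key/value sequence of position attention through multi-scale global average pooling, can be illustrated with a short sketch. The PyTorch module below is a hypothetical reconstruction for illustration only: the class name, the pooling scales, and the channel-reduction factor are assumptions made here, not the authors' released implementation. Pooling the keys and values down to a small fixed number of spatial tokens S replaces the O(N^2) pairwise affinity of the standard PAM with an O(N*S) one.

```python
# Illustrative sketch: position attention with multi-scale pooled keys/values.
# Class name, pool sizes, and reduction factor are assumptions for this example.
import torch
import torch.nn as nn


class MultiScalePooledPositionAttention(nn.Module):
    def __init__(self, in_channels: int, pool_sizes=(1, 3, 6, 8), reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # Multi-scale adaptive average pooling: keys/values are summarized by a
        # small, fixed number of spatial tokens instead of all H*W positions.
        self.pools = nn.ModuleList([nn.AdaptiveAvgPool2d(s) for s in pool_sizes])
        self.gamma = nn.Parameter(torch.zeros(1))

    def _pool_tokens(self, feat: torch.Tensor) -> torch.Tensor:
        # Flatten each pooled map and concatenate along the token dimension:
        # (B, C, H, W) -> (B, C, S) with S = sum(s*s for s in pool_sizes).
        b, c = feat.shape[:2]
        return torch.cat([p(feat).reshape(b, c, -1) for p in self.pools], dim=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).reshape(b, -1, h * w).permute(0, 2, 1)  # (B, N, C')
        k = self._pool_tokens(self.key(x))                        # (B, C', S), S << N
        v = self._pool_tokens(self.value(x))                      # (B, C, S)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)             # (B, N, S)
        out = torch.bmm(v, attn.permute(0, 2, 1)).reshape(b, c, h, w)
        # Residual connection with a learnable scale, as in DANet-style attention.
        return self.gamma * out + x
```

For a 128 x 128 feature map, N = 16384 positions, while the pooled token count under these assumed scales is only 1 + 9 + 36 + 64 = 110, which illustrates why pooling the keys and values can yield memory and speed savings of the kind reported above.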