Abstract

Semantic segmentation has been one of the essential tasks in computer vision. More complicated and computationally intensive mechanisms are integrated into segmentation models to get more accurate results, leading to increased processing time and resource usage. Based on the idea that pixels with similar high-level features are more likely to have similar semantic labels, we propose a computationally efficient mechanism named Hierarchical Semantic Broadcasting (HSB), which can infer earlier-stage semantic label maps from lower-level feature maps by referring to semantic label maps of higher-level feature maps. Since lower-level feature maps have higher resolution and richer context, HSB can provide additional details for better semantic segmentation. By integrating HSB into a general-purpose light network, we propose Hierarchical Semantic Broadcasting Network (HSB-Net) for real-time semantic segmentation, which achieves a good trade-off between accuracy and speed. Evaluated by the Cityscapes dataset, HSB-Net can run at 123.7 FPS for a 512 <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$ \boldsymbol{\times }$</tex-math></inline-formula> 1024 input on a single GeForce RTX 2080 Ti card while achieving 73.1% mean IoU.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.