Hierarchical Semantic Broadcasting Network for Real-Time Semantic Segmentation

Genling Li, Liang Li, Jiawan Zhang

doi:10.1109/lsp.2021.3129418

Abstract

Semantic segmentation is one of the essential tasks in computer vision. To obtain more accurate results, increasingly complicated and computationally intensive mechanisms have been integrated into segmentation models, which increases processing time and resource usage. Based on the idea that pixels with similar high-level features are more likely to have similar semantic labels, we propose a computationally efficient mechanism named Hierarchical Semantic Broadcasting (HSB), which infers earlier-stage semantic label maps from lower-level feature maps by referring to the semantic label maps of higher-level feature maps. Since lower-level feature maps have higher resolution and richer context, HSB can provide additional details for better semantic segmentation. By integrating HSB into a general-purpose lightweight network, we propose the Hierarchical Semantic Broadcasting Network (HSB-Net) for real-time semantic segmentation, which achieves a good trade-off between accuracy and speed. Evaluated on the Cityscapes dataset, HSB-Net runs at 123.7 FPS for a 512 × 1024 input on a single GeForce RTX 2080 Ti card while achieving 73.1% mean IoU.
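The two figures reported above, frames per second and mean IoU, can be made concrete with a minimal sketch. It is not the evaluation code used for HSB-Net: the helper names `mean_iou` and `latency_ms` and the toy label maps are hypothetical, and details of the official Cityscapes protocol (such as ignored labels) are left out. Mean IoU is computed here in the standard way, as the per-class ratio of true positives to the union of predicted and ground-truth pixels, averaged over the classes that occur.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target.

    pred and target are 2-D lists of integer class ids with the same shape.
    Assumption: no 'ignore' label handling, unlike the official benchmark.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both maps
            ious.append(tp / union)
    if not ious:
        raise ValueError("no class occurs in pred or target")
    return sum(ious) / len(ious)


def latency_ms(fps):
    """Average per-frame processing time in milliseconds for a given throughput."""
    return 1000.0 / fps


# Toy 2 x 3 label maps with three classes (illustrative values only)
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]

print(mean_iou(pred, target, num_classes=3))  # per-class IoUs: 1/2, 3/4, 1
print(round(latency_ms(123.7), 2))            # per-frame time at 123.7 FPS
```

At the reported 123.7 FPS, the average per-frame time works out to roughly 8.1 ms, comfortably inside the 33 ms budget that a 30 FPS real-time target would allow.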
