Abstract

Video semantic segmentation (VSS) is a fundamental machine vision task with practical applications such as autonomous driving and automated surveillance. Current studies mainly exploit temporal features based on optical flow and the self-attention mechanism to improve VSS accuracy. However, these approaches lose accuracy when the optical flow is inaccurate and incur the high computational cost of self-attention. To address these problems, we propose the Temporal Difference Segmentation Net (TDSNet). To improve accuracy while keeping computational cost low, TDSNet extracts temporal features based on temporal differences through the Temporal Feature Refine Module (TFRM). To further improve VSS accuracy, TDSNet adaptively fuses temporal features of varied motion magnitude with the Motion Magnitude Refine Module (MMRM), which weighs and fuses temporal features of different magnitudes between frames. Extensive experiments demonstrate that the overall performance of TDSNet surpasses that of state-of-the-art (SOTA) VSS models on two large-scale public datasets, VSPW and Cityscapes. For instance, on VSPW, TDSNet runs 13.2 frames per second (FPS) faster than the SOTA model CFFM++, while its mIoU is only 0.7% lower than that of CFFM++. These results indicate promising performance in VSS applications.
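
The general idea of temporal-difference features fused by motion magnitude can be illustrated with a minimal NumPy sketch. This is not the TFRM or MMRM described in the abstract, whose internals are not given here; the function names, the tensor layout `(T, C, H, W)`, the use of mean absolute difference as a motion magnitude, and the softmax weighting are all assumptions made for illustration only.

```python
import numpy as np


def temporal_difference(features):
    """Difference between consecutive frame features.

    features: array of shape (T, C, H, W) (assumed layout).
    Returns an array of shape (T-1, C, H, W).
    """
    return features[1:] - features[:-1]


def magnitude_weighted_fusion(diffs):
    """Fuse difference maps into one (C, H, W) map.

    Each difference map is weighted per pixel by a softmax over its
    motion magnitude (hypothetical weighting, not the paper's MMRM).
    """
    # Per-pixel motion magnitude: mean absolute difference over channels.
    magnitude = np.abs(diffs).mean(axis=1, keepdims=True)  # (T-1, 1, H, W)
    # Numerically stable softmax over the temporal axis.
    shifted = magnitude - magnitude.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    # Weighted sum over the temporal axis.
    return (weights * diffs).sum(axis=0)


# Toy input: 4 frames, 8 channels, 16x16 feature maps.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 16, 16)).astype(np.float32)
diffs = temporal_difference(feats)
fused = magnitude_weighted_fusion(diffs)
```

Differencing needs only element-wise subtraction between adjacent frames, which suggests why a difference-based design can be cheaper than estimating optical flow or computing self-attention across frames.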
