Abstract: This paper presents a novel approach to the challenging task of real-time drivable road region extraction in computer vision. Semantic segmentation, the pixel-level identification and delineation of objects, is a complex problem, particularly under real-time constraints. Deep learning has proven to be a powerful technique for semantic segmentation because it learns discriminative patterns automatically rather than relying on hand-crafted rules. To tackle this task, the paper proposes a fusion of the YOLO algorithm and the UNET architecture, leveraging their respective strengths: YOLO enables high-speed object detection, while UNET exploits global location and contextual information and performs well even with limited training samples. Importantly, the proposed method is lightweight, making it suitable for deployment on embedded systems with limited computational power. To reduce memory usage and capture context at multiple scales, the system employs dilated convolutions for efficient feature extraction. The algorithm segments irregular objects accurately and handles diverse input types, including images and videos, in real time. Overall, this work contributes to the advancement of computer vision and offers a practical solution for real-time drivable road region extraction, with potential applications in autonomous vehicles and intelligent transportation systems.
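The abstract does not give implementation details for the dilated-convolution feature extraction it mentions; the following is a minimal sketch, assuming a PyTorch-style encoder block, of how parallel dilated convolutions can capture multi-scale context for road segmentation. The module name `DilatedContextBlock`, the dilation rates (1, 2, 4), and the channel counts are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): a dilated-convolution
# context block of the kind described in the abstract, written in PyTorch.
# Dilation rates, channel counts, and module names are illustrative assumptions.
import torch
import torch.nn as nn

class DilatedContextBlock(nn.Module):
    """Extracts multi-scale context with parallel dilated convolutions."""
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # Parallel 3x3 convolutions with increasing dilation widen the
        # receptive field without pooling or extra parameters per branch.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, out_channels, kernel_size=3,
                      padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        # A 1x1 convolution fuses the concatenated branch outputs.
        self.fuse = nn.Conv2d(3 * out_channels, out_channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [self.act(branch(x)) for branch in self.branches]
        return self.act(self.fuse(torch.cat(feats, dim=1)))

# Usage: a single RGB frame produces a multi-scale feature map of the same
# spatial size, which a UNET-style decoder could then upsample into a road mask.
if __name__ == "__main__":
    block = DilatedContextBlock(in_channels=3, out_channels=16)
    frame = torch.randn(1, 3, 256, 256)
    print(block(frame).shape)  # torch.Size([1, 16, 256, 256])
```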