Automating the identification, localization, and monitoring of roadway assets distributed across the roadway network is critical for traffic management systems, as it can efficiently provide up-to-date information to support transportation asset management. Collecting videos with vehicle-mounted cameras and processing the data with computer vision-based deep learning methods is garnering increasing attention from transportation agencies. While promising, this approach faces challenges: high-quality image annotations for roadway assets are scarce, the assets themselves are difficult to identify, and existing solutions are limited. The Segment Anything Model (SAM), a visual foundation model, demonstrates robust zero-shot capability for general image segmentation under various prompts. This study evaluates SAM’s applicability and efficiency in extracting roadway assets from images; specifically, it examines how model size and prompt quality affect SAM’s performance in segmenting roadway assets. Five state-of-the-art semantic segmentation models are trained and compared with SAM. Results show that a lightweight SAM with human-rendered prompts outperforms all five semantic segmentation models. Based on these results, future work will explore incorporating SAM into transportation asset management applications, promoting collaboration between human experts and artificial intelligence.