Abstract

Because objects such as kilometer stones appear at widely varying sizes, their detection remains a challenge and directly affects the accuracy of object counts. Transformers have demonstrated impressive results in various natural language processing (NLP) and image processing tasks owing to their ability to model long-range dependencies. This paper proposes an exceeding you only look once (YOLO) series detector with two contributions: (i) we employ a pre-training objective that recovers the original visual tokens from image patches of road asset images; using the pre-trained Vision Transformer (ViT) as a backbone, we then fine-tune the model weights on downstream tasks by attaching task layers on top of the pre-trained encoder. (ii) We adopt a Feature Pyramid Network (FPN) decoder design that learns the importance of different input features rather than simply summing or concatenating them, which can cause feature mismatch and performance degradation. Our proposed method (Transformer-Based YOLOX with FPN) learns very general object representations and significantly outperforms other state-of-the-art (SOTA) detectors, including YOLOv5S, YOLOv5M, and YOLOv5L. It reaches 61.5% AP on the Thailand highway corpus, surpassing the current best practice (YOLOv5L) by 2.56% AP on the test-dev data set.
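
Contribution (i) amounts to taking a pre-trained ViT encoder and joining new task layers on top of it before fine-tuning. The following is a minimal sketch of that pattern only, not the authors' implementation: torchvision's vit_b_16, the 4-class head, and the training hyperparameters are assumptions for illustration, and in the paper the role of the task layers is played by the YOLOX detection head rather than a classifier.

```python
# Sketch: attach task layers on top of a pre-trained ViT encoder and fine-tune.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_CLASSES = 4  # assumed example classes, e.g. KM Sign, KM Stone, ...

# Load an ImageNet pre-trained ViT encoder and drop its original classifier.
backbone = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
feat_dim = backbone.heads.head.in_features
backbone.heads = nn.Identity()  # the encoder now returns the [CLS] representation

# "Join task layers upon the pre-trained encoder": here a simple classification
# head stands in for the detection head used in the paper.
task_head = nn.Sequential(
    nn.LayerNorm(feat_dim),
    nn.Linear(feat_dim, NUM_CLASSES),
)
model = nn.Sequential(backbone, task_head)

# Fine-tune all weights end-to-end on the downstream data (dummy batch below).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
criterion = nn.CrossEntropyLoss()

images = torch.randn(2, 3, 224, 224)   # stand-in for road asset images
labels = torch.tensor([0, 1])
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```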

Highlights

  • Identifying road asset objects in Thailand highway monitoring image sequences is essential for intelligent traffic monitoring and highway administration

  • The results proved that our Transformer-Based YOLOX with Feature Pyramid Network (FPN) outperforms the YOLOv5S, YOLOv5M, and YOLOv5L baselines (an illustrative sketch of the weighted fusion idea follows this list)

  • Our proposed YOLOX with Vision Transformer and FPN reaches the highest Average Precision (AP) of 61.15% on the testing set
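
The FPN contribution replaces plain summation or concatenation of multi-scale features with a fusion step whose per-input weights are learned. Below is a minimal sketch of that idea under stated assumptions: the channel sizes, layer names, and the simple top-down layout are illustrative choices, not the paper's exact decoder.

```python
# Sketch: top-down FPN whose merge step uses learned, normalized input weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse feature maps of equal shape with normalized learnable weights."""
    def __init__(self, num_inputs: int = 2, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, feats):
        # ReLU keeps the weights non-negative; normalization makes them sum to 1,
        # so the network learns how much each input scale contributes.
        w = F.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * fi for wi, fi in zip(w, feats))

class TinyFPN(nn.Module):
    """Minimal top-down FPN that merges levels with WeightedFusion."""
    def __init__(self, in_channels=(192, 384, 768), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        self.fuse = nn.ModuleList(WeightedFusion(2) for _ in in_channels[:-1])
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels
        )

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        p4 = self.fuse[1]([self.lateral[1](c4),
                           F.interpolate(p5, scale_factor=2, mode="nearest")])
        p3 = self.fuse[0]([self.lateral[0](c3),
                           F.interpolate(p4, scale_factor=2, mode="nearest")])
        return [s(p) for s, p in zip(self.smooth, (p3, p4, p5))]

if __name__ == "__main__":
    c3, c4, c5 = (torch.randn(1, 192, 80, 80),
                  torch.randn(1, 384, 40, 40),
                  torch.randn(1, 768, 20, 20))
    for p in TinyFPN()(c3, c4, c5):
        print(p.shape)  # every level now has 256 channels
```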


Summary

Introduction

Identifying road asset objects in Thailand highway monitoring image sequences is essential for intelligent traffic monitoring and highway administration. With the widespread use of traffic surveillance cameras, an extensive library of traffic video footage has become available for examination. The more distant road surface is usually observed from an eye-level viewing angle; at this angle, object sizes vary enormously, and detection accuracy for small items far from the camera is low. In the face of such complicated camera scenarios, it is critical to address these difficulties effectively. This study applies the object detection results to multi-object tracking and asset object counting, including kilometer signs (marked as KM Sign) and kilometer stones (marked as KM Stone).
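
As a rough illustration of the counting step mentioned above, once detections carry stable track IDs from a multi-object tracker, each asset can be counted once per unique track rather than once per frame. The data layout and function below are assumptions for the example, not the paper's pipeline.

```python
# Sketch: count road assets from tracked detections by unique track ID per class.
from collections import defaultdict

def count_assets(tracked_detections):
    """tracked_detections: iterable of (frame_idx, track_id, class_name) tuples."""
    seen = defaultdict(set)                 # class_name -> set of track IDs
    for _frame, track_id, class_name in tracked_detections:
        seen[class_name].add(track_id)
    return {cls: len(ids) for cls, ids in seen.items()}

if __name__ == "__main__":
    demo = [(0, 1, "KM Sign"), (1, 1, "KM Sign"),    # same sign seen in two frames
            (2, 2, "KM Stone"), (3, 3, "KM Stone")]
    print(count_assets(demo))                        # {'KM Sign': 1, 'KM Stone': 2}
```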
