Abstract

The application of computer vision in transportation engineering has enabled real-time traffic flow optimization, vehicle counting, anomaly detection, and improved transportation safety. Most vision systems, however, are developed through supervised learning, which can be data-hungry and costly because it requires manual annotation of objects from a variety of sources. The general rule of thumb for building accurate and transferable vision models has been to increase the quality, diversity, and quantity of the annotated datasets used in model training. This paper presents a simple yet efficient active learning framework that significantly reduces the number of annotations needed to build a state-of-the-art vehicle detection and classification model. To achieve this, we first leverage a vision transformer that generates embeddings rich with the information needed to quantify the similarity and diversity between images in a two-dimensional embedding space. To select which images from the embedding space should be annotated, we propose a scoring and sampling strategy that minimizes class imbalance and model uncertainty through an iterative process. The latest iteration of the You Only Look Once (YOLO) model, YOLOv8, is used as the active learner. We compare the efficacy of our proposed active learning methods against models developed at much higher sampling rates using mean average precision (mAP). The resulting models were also integrated with tracking algorithms to evaluate differences in vehicle-count accuracy and their practical implications for directional counts.
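As a rough illustration of the selection step the abstract describes, the sketch below combines vision-transformer CLS-token embeddings with a detector-confidence uncertainty score inside a greedy diversity selector. Everything here is an assumption for illustration rather than the paper's actual method: the checkpoints (`google/vit-base-patch16-224`, `yolov8n.pt`), the `1 - mean confidence` uncertainty score, and the k-center-style greedy combination are all placeholders, and the 2D projection of the embedding space (e.g., via t-SNE) is omitted, with selection run in the full embedding space for simplicity.

```python
# Hypothetical active-learning selection step: pick the next batch of images
# to annotate by balancing embedding diversity and detector uncertainty.
# Model names and the scoring rule are illustrative assumptions, not the
# paper's published strategy.
import numpy as np
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel
from ultralytics import YOLO


def vit_embeddings(image_paths, device="cpu"):
    """Return one CLS-token embedding per image from a pretrained ViT."""
    processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
    model = ViTModel.from_pretrained("google/vit-base-patch16-224").to(device).eval()
    feats = []
    with torch.no_grad():
        for path in image_paths:
            inputs = processor(
                images=Image.open(path).convert("RGB"), return_tensors="pt"
            ).to(device)
            feats.append(model(**inputs).last_hidden_state[:, 0].squeeze(0).cpu().numpy())
    return np.stack(feats)


def detector_uncertainty(image_paths, weights="yolov8n.pt"):
    """Score each image as 1 - mean detection confidence (higher = more uncertain)."""
    detector = YOLO(weights)
    scores = []
    for path in image_paths:
        conf = detector(path, verbose=False)[0].boxes.conf
        scores.append(1.0 if conf.numel() == 0 else 1.0 - float(conf.mean()))
    return np.array(scores)


def select_batch(emb, scores, k):
    """Greedily pick k images that are both far apart in embedding space and uncertain."""
    chosen = [int(scores.argmax())]  # seed with the most uncertain image
    dists = np.linalg.norm(emb - emb[chosen[0]], axis=1)
    for _ in range(k - 1):
        idx = int((dists * scores).argmax())  # far from the selected set AND uncertain
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(emb - emb[idx], axis=1))
    return chosen


# Example: pick the 100 most informative frames from an unlabeled pool.
# pool = sorted(glob.glob("frames/*.jpg"))
# picks = select_batch(vit_embeddings(pool), detector_uncertainty(pool), k=100)
```

In an iterative loop, the selected images would be annotated, added to the training set, and used to retrain the detector before the next round of scoring.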
