Abstract

Modern public buses are equipped with multiple built-in cameras that offer real-time or recorded monitoring through centralized systems. This research introduces a novel approach for classifying driver behaviors at signalized road intersections by leveraging a public bus's forward-facing vehicular camera, whose view encompasses the road ahead and thus the traffic lights. The proposed method involves a two-step multi-head self-attention network that employs the extracted traffic light state and optical flow information alongside the raw vehicular camera video. Subsequently, a bidirectional long short-term memory (Bi-LSTM) based spatio-temporal aggregation strategy is applied to guide the learning of the final discriminative representation. To evaluate the proposed approach, we introduce a benchmark dataset called BusEye, which covers challenging real-life scenarios captured by actual vehicular cameras mounted on public buses. We provide extensive analysis and ablation studies on the BusEye dataset, showing that our proposed solution outperforms state-of-the-art video classification methods both on a per-class basis and overall. Based on the outcomes derived from the validation and test data, the proposed method demonstrates remarkable efficacy in classifying driver behaviors at signalized intersections.
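The pipeline the abstract describes (fusing traffic-light state, optical flow, and raw video features with multi-head self-attention, then aggregating bidirectionally over time) can be illustrated with a minimal NumPy sketch. All dimensions, the random features, and the mean-based bidirectional pooling below are illustrative assumptions standing in for the paper's learned Bi-LSTM and trained feature extractors, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads, rng):
    """Toy multi-head self-attention over a (T, D) frame-feature sequence.

    Projection weights are random stand-ins for learned parameters.
    """
    T, D = X.shape
    dh = D // num_heads  # per-head dimension
    Wq = rng.standard_normal((D, D)) / np.sqrt(D)
    Wk = rng.standard_normal((D, D)) / np.sqrt(D)
    Wv = rng.standard_normal((D, D)) / np.sqrt(D)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(num_heads):
        s = slice(h * dh, (h + 1) * dh)
        A = softmax(Q[:, s] @ K[:, s].T / np.sqrt(dh), axis=-1)
        heads.append(A @ V[:, s])
    return np.concatenate(heads, axis=-1)  # (T, D)

rng = np.random.default_rng(0)
T = 16  # frames in a clip (assumed clip length)

# Hypothetical per-frame features for the three input streams:
video = rng.standard_normal((T, 32))  # raw-video features
flow = rng.standard_normal((T, 16))   # optical-flow features
light = rng.standard_normal((T, 8))   # traffic-light-state embedding
fused = np.concatenate([video, flow, light], axis=-1)  # (T, 56)

attended = multi_head_self_attention(fused, num_heads=4, rng=rng)

# Crude bidirectional temporal aggregation (a stand-in for the Bi-LSTM):
# running means over past frames (forward) and future frames (backward).
fwd = np.cumsum(attended, axis=0) / np.arange(1, T + 1)[:, None]
bwd = np.cumsum(attended[::-1], axis=0)[::-1] / np.arange(T, 0, -1)[:, None]
clip_repr = np.concatenate([fwd[-1], bwd[0]])  # final clip-level representation
```

In the actual method, `clip_repr` would feed a classifier head over the driver-behavior classes; here it simply demonstrates the fuse-attend-aggregate flow.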
