GLMDriveNet: Global–local Multimodal Fusion Driving Behavior Classification Network

Wenzhuo Liu, Yan Gong, Guoying Zhang, Jianli Lu, Yunlai Zhou, Junbin Liao

doi:10.1016/j.engappai.2023.107575

Abstract

Driving behavior classification plays an important role in many fields, such as Advanced Driver Assistance Systems (ADAS), traffic safety, and energy saving. In this paper, we propose a Global–local Multimodal Fusion Driving Behavior Classification Network (GLMDriveNet), which classifies driver behavior as normal, aggressive, or drowsy driving. First, we design a Global–local Interaction Channel Attention Module (GLI-CAM) to extract effective features from both the roadside image and a spectrogram generated from the vehicle speeds at the current prediction time and over the preceding four seconds. Within this module, a learnable positional embedding is introduced to fuse the global and local information of the channels so that the extracted features are screened more effectively. Second, we propose a Multi-scale Feature Representation Fusion Module (MS-FRFM) that associates the high-scale and low-scale information of images and spectrograms and assigns different weights to the two modalities, so that the network favors the more useful modal information. Evaluated on the public UAH-DriveSet dataset, our model achieves the best performance among state-of-the-art methods (98.4% F1-score on all roads, 97.4% F1-score on the motorway road, and 99.8% F1-score on the secondary road). The model is also fast (142 FPS) and generalizes well, as verified through extensive experiments on multiple datasets. The code is available at https://github.com/liuwenzhuo1/GLMDrivenet.
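To make the spectrogram input more concrete, the following is a minimal sketch of how a short window of vehicle speeds could be turned into a magnitude spectrogram with a short-time discrete Fourier transform. It is an illustration only and not necessarily the authors' procedure: the abstract does not state the sampling rate, frame length, hop size, or window function, so `SAMPLE_RATE_HZ`, `frame_len`, `hop`, the Hann window, the synthetic speed trace, and the function name `speed_spectrogram` are all assumptions.

```python
import cmath
import math


def speed_spectrogram(speeds, frame_len=16, hop=4):
    """Magnitude spectrogram of a 1-D speed trace via a short-time DFT.

    frame_len and hop are illustrative defaults, not values from the paper.
    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes.
    """
    if len(speeds) < frame_len:
        raise ValueError("speed window is shorter than one frame")
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(speeds) - frame_len + 1, hop):
        segment = [speeds[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Direct DFT of the windowed segment at frequency bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical input: four seconds of history plus the current sample,
# at an assumed sampling rate of 10 Hz (41 samples in total).
SAMPLE_RATE_HZ = 10
speeds = [60 + 5 * math.sin(2 * math.pi * 1.25 * t / SAMPLE_RATE_HZ)
          for t in range(4 * SAMPLE_RATE_HZ + 1)]
spec = speed_spectrogram(speeds)
print(len(spec), len(spec[0]))
```

With these assumed settings the 41-sample window yields 7 time frames of 9 frequency bins each. In practice a library routine for the short-time Fourier transform would replace the explicit loop, and the resulting 2-D array would be treated as an image-like input alongside the roadside image.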

Full Text