Abstract
The significance of Driver Distraction Detection (DDD) for road safety is immense. While many DDD methods focus on overt distractions such as driving behavior, gaze direction, or hand movements, they often overlook cognitive distraction owing to its intricate nature. This study bridges that gap by leveraging driver eye-movement data to detect cognitive distraction. To this end, we introduce the Driver Cognitive Distraction Detection (DCDD) model, which takes the driver's gaze points, captured by an eye tracker, and the DashCam Image (DCI) as inputs. To capture the temporal dynamics of the driver's eye movements, we devise a preprocessing technique that generates an eye-movement heat Map (Map) highlighting the driver's areas of focus within the DCI. A Fusion Adversarial Network (FAN) then blends DCI features with Map features using the Sigmoid activation function, after which a shared-parameter network performs adversarial learning on the inversely activated features. To further improve the extraction of DCI and Map features, we propose the Multi-View Space Channel Network (MSCN), which integrates low-dimensional spatial features and high-dimensional channel features from different views of the DCI and Map, helping the Transformer capture DCI and Map features more comprehensively along the temporal dimension. By effectively integrating DCI and Map data and recursively extracting temporal information, the DCDD model attains a high accuracy of 96.42% while keeping model parameters (Params) and FLOPs under control.
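As a rough illustration only (not the paper's implementation), the two mechanisms the abstract names, rendering gaze points into a heat Map over the DCI and sigmoid-gated fusion that also yields the inversely activated features for the adversarial branch, might be sketched as follows. All function names, the Gaussian-rendering choice, and the elementwise gating scheme are assumptions for illustration:

```python
import numpy as np

def gaze_heatmap(gaze_points, h, w, sigma=20.0):
    """Render (x, y) gaze points as a normalized Gaussian heat map
    the size of the dashcam frame (an assumed rendering scheme)."""
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float64)
    for gx, gy in gaze_points:
        heat += np.exp(-((xs - gx) ** 2 + (ys - gy) ** 2) / (2 * sigma ** 2))
    if heat.max() > 0:
        heat /= heat.max()  # scale to [0, 1]
    return heat

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(dci_feat, map_feat):
    """Sigmoid-gated blend of DCI features with Map features.
    The (1 - gate) branch gives the inversely activated features
    that a shared-parameter network could learn adversarially."""
    gate = sigmoid(map_feat)
    fused = gate * dci_feat            # gaze-attended DCI features
    inverse = (1.0 - gate) * dci_feat  # inversely activated features
    return fused, inverse
```

The sketch shows only the data flow; in the actual model these operations would act on learned feature tensors rather than raw arrays.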