To address the low real-time performance and high false-positive rates of deep-learning-based driver fatigue detection methods, this paper proposes a temporal-sequence Transformer fatigue detection method grounded in the localization of drivers' facial landmarks. First, face positions are obtained with the single-stage face detection algorithm RetinaFace. Next, a lightweight GM module is designed as the principal feature-extraction module of a multi-scale fusion facial landmark detection network, and temporal fatigue feature parameters are computed from the detected landmarks. Finally, a temporal-sequence Transformer classifier is developed to classify the sequences of fatigue feature parameters. Experimental results show that face detection and landmark detection together require only 16.8 ms of inference time, of which facial landmark detection accounts for just 2.5 ms per frame, meeting the real-time requirements of the feature-extraction stage of fatigue driving detection. With fatigue feature sequences built on the NTHU-DDD dataset, the trained temporal-sequence Transformer achieves an accuracy of 91.4%.
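The final stage described above can be sketched as a small self-attention classifier over per-frame fatigue feature vectors. This is a minimal illustrative sketch, not the paper's architecture: the feature dimension (4), model width, single attention head, single layer, random untrained weights, and mean-pooling readout are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class TinyTemporalTransformer:
    """Toy single-layer, single-head Transformer encoder classifier.

    Input: a (T, n_feats) sequence of fatigue feature parameters
    (e.g., eye closure, mouth opening, head pose per frame -- the
    exact features are hypothetical here).
    Output: class probabilities (alert vs. fatigued).
    """

    def __init__(self, n_feats=4, d_model=16, n_classes=2):
        s = 1.0 / np.sqrt(d_model)
        self.d_model = d_model
        self.W_in = rng.normal(0, s, (n_feats, d_model))  # per-frame embedding
        self.W_q = rng.normal(0, s, (d_model, d_model))
        self.W_k = rng.normal(0, s, (d_model, d_model))
        self.W_v = rng.normal(0, s, (d_model, d_model))
        self.W_out = rng.normal(0, s, (d_model, n_classes))

    def forward(self, seq):
        x = seq @ self.W_in                       # (T, d_model)
        # Sinusoidal positional encoding so attention sees frame order.
        T = x.shape[0]
        pos = np.arange(T)[:, None]
        i = np.arange(self.d_model)[None, :]
        angle = pos / np.power(10000.0, (2 * (i // 2)) / self.d_model)
        x = x + np.where(i % 2 == 0, np.sin(angle), np.cos(angle))
        # Single-head scaled dot-product self-attention over time.
        q, k, v = x @ self.W_q, x @ self.W_k, x @ self.W_v
        attn = softmax(q @ k.T / np.sqrt(self.d_model), axis=-1)
        h = attn @ v                              # (T, d_model)
        # Mean-pool over frames, then project to class logits.
        return softmax(h.mean(axis=0) @ self.W_out)

model = TinyTemporalTransformer()
probs = model.forward(rng.normal(size=(30, 4)))   # 30 frames, 4 features
```

A real implementation would stack several encoder layers, train on labeled NTHU-DDD feature sequences, and likely use a framework such as PyTorch; this sketch only shows the data flow from a feature sequence to a fatigue classification.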