Gaze zone detection estimates where drivers look in terms of broad categories (e.g., left mirror, speedometer, rear mirror). Here we focus on the automatic annotation of gaze zones in road safety research, where the system can be tuned to specific drivers and driving conditions to obtain an easy-to-use yet accurate system. Using an existing dataset of eye region crops (nine gaze zones) and two newly collected datasets (12 and 10 gaze zones), we show that image classification with YOLOv8, which has a simple command line interface, achieves near-perfect accuracy without any pre-processing of the images, provided that a model is trained on the driver and conditions for which annotation is required (for example, whether the driver wears glasses or sunglasses). We also present two apps for collecting the training images and for training and applying the YOLOv8 models. Future research will need to explore how well the method extends to real driving conditions, which may be more variable and for which ground truth labels are more difficult to obtain.
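As a rough illustration of the per-driver workflow described above, the sketch below arranges labelled images into the `train/<class>` and `val/<class>` folder layout that YOLOv8 classification expects. The helper name `split_by_zone`, the 80/20 split, and the example folder names are assumptions made for illustration; they are not taken from the apps or datasets presented here.

```python
import random
import shutil
from pathlib import Path


def split_by_zone(src_dir, out_dir, val_fraction=0.2, seed=0):
    """Copy images from src_dir/<zone>/ into out_dir/{train,val}/<zone>/.

    Hypothetical helper: assumes one sub-folder per gaze zone in src_dir.
    Returns {zone: (n_train, n_val)}.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for zone_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        images = sorted(p for p in zone_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        # Hold out at least one image per zone when the zone has two or more.
        n_val = max(1, round(len(images) * val_fraction)) if len(images) > 1 else 0
        for i, img in enumerate(images):
            split = "val" if i < n_val else "train"
            dest = out_dir / split / zone_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(img, dest / img.name)
        counts[zone_dir.name] = (len(images) - n_val, n_val)
    return counts


# Training and annotation could then use the Ultralytics command line,
# for example (model size, epochs, and image size are assumed values):
#   yolo classify train data=driver01_split model=yolov8n-cls.pt epochs=50 imgsz=224
#   yolo classify predict model=path/to/best.pt source=path/to/new_images
```

Keeping one such folder per driver and condition (for example, with and without sunglasses) mirrors the idea of training a separate model for each situation that needs annotation.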