Functional near-infrared spectroscopy (fNIRS) is beneficial for studying brain activity in naturalistic settings due to its tolerance for movement. However, residual motion artifacts still compromise fNIRS data quality and might lead to spurious results. Although some motion artifact correction algorithms have been proposed in the literature, their development and accurate evaluation have been challenged by the lack of ground truth information. This is because ground truth information is time- and labor-intensive to manually annotate. This work investigates the feasibility and reliability of a deep learning computer vision (CV) approach for automated detection and annotation of head movements from video recordings. Fifteen participants performed controlled head movements across three main rotational axes (head up/down, head left/right, bend left/right) at two speeds (fast and slow), and in different ways (half, complete, repeated movement). Sessions were video recorded and head movement information was obtained using a CV approach. A 1-dimensional UNet model (1D-UNet) that detects head movements from head orientation signals extracted via a pre-trained model (SynergyNet) was implemented. Movements were manually annotated as a ground truth for model evaluation. The model's performance was evaluated using the Jaccard index. The model showed comparable performance between the training and test sets (J train = 0.954; J test = 0.865). Moreover, it demonstrated good and consistent performance at annotating movement across movement axes and speeds. However, performance varied by movement type, with the best results being obtained for repeated (J test = 0.941), followed by complete (J test = 0.872), and then half movements (J test = 0.826). This study suggests that the proposed CV approach provides accurate ground truth movement information. Future research can rely on this CV approach to evaluate and improve fNIRS motion artifact correction algorithms.
Read full abstract