Abstract

Human action recognition (HAR) in untrimmed videos can yield insightful predictions of human behaviour. Previous work on HAR included models that were trained on spatial and temporal annotations and could classify only a limited set of actions from trimmed videos. These methods reported limitations such as (1) performance degradation due to the lack of precise temporal region proposals and (2) poor adaptability to the clinical domain because the actions they target are unrelated to those of clinical interest. We propose a method that analyses untrimmed behavioural videos to recommend actions of interest that can inform diagnostic and functional assessments for children with Autism Spectrum Disorder (ASD). Our method comprises an end-to-end behaviour action recognition (BAR) pipeline, including child detection, temporal action localization, and identification and classification of actions of interest. The model, trained on data from 400 children with ASD and 125 children with other developmental delays (ODD), identified ASD, ODD, and neurotypical children with 79.7%, 77.2%, and 80.8% accuracy, respectively. On the independent benchmark Self-Stimulatory Behaviour Dataset (SSBD), the model achieved a top-1 accuracy of 78.57% for combined localization and action recognition, significantly higher than previously reported results.

