Abstract

We analyze the relationship between video complexity and the performance of Human Action Recognition (HAR) algorithms. The rationale is that variations in image conditions (e.g., occlusion, camera movement, resolution, and illumination) and image content (e.g., edge density and number of objects), both of which reflect scene complexity, make it harder for a computational model to recognize activities. The HAR algorithms used in this work are improved Dense Trajectories (iDT) [25], Motion-Augmented RGB Stream for Action Recognition (MARS) [5], and SlowFast [7]. Their performance is compared with the number of people and objects in the scene and with three statistical measures: entropy, number of regions, and edge density. The results so far show a correlation between complexity and classification performance. The Mask R-CNN runs used to count elements were carried out on the supercomputer cluster of LSC-INAOE.
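To make two of the statistical measures concrete, the following is a minimal sketch of how entropy and edge density could be computed for a single grayscale frame. The abstract does not specify the exact definitions used, so everything here is an assumption for illustration: the function names `shannon_entropy` and `edge_density`, the use of the gray-level histogram for entropy, the forward-difference gradient, and the `threshold` value are all hypothetical choices, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(image):
    """Shannon entropy (in bits) of the gray-level histogram of a 2D image.

    Assumption: the image is a list of rows of integer gray levels.
    """
    pixels = [p for row in image for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def edge_density(image, threshold=32):
    """Fraction of pixels whose gradient magnitude exceeds `threshold`.

    Assumption: a simple forward-difference gradient stands in for
    whatever edge detector the original study used.
    """
    height, width = len(image), len(image[0])
    edges = 0
    for y in range(height):
        for x in range(width):
            # Forward differences; treated as zero at the right and bottom borders.
            gx = image[y][x + 1] - image[y][x] if x + 1 < width else 0
            gy = image[y + 1][x] - image[y][x] if y + 1 < height else 0
            if math.hypot(gx, gy) > threshold:
                edges += 1
    return edges / (height * width)


# Two toy 4x4 frames: a uniform one and one with a single vertical edge.
flat = [[128] * 4 for _ in range(4)]
split = [[0, 0, 255, 255] for _ in range(4)]

print(shannon_entropy(flat), edge_density(flat))
print(shannon_entropy(split), edge_density(split))
```

Under these assumptions, the uniform frame scores zero on both measures, while the frame with one vertical edge has higher entropy and a nonzero edge density, which matches the intuition that busier scenes score as more complex.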
