Abstract

Event-based cameras are bio-inspired sensors that capture asynchronous per-pixel brightness changes (events), offering high temporal resolution and low power consumption compared with traditional frame-based cameras. Despite recent progress in human action recognition with traditional cameras, few solutions have been proposed for event cameras due to their unconventional frameless output. To perform low-power human action recognition on event cameras, we propose a multipath deep neural network for action recognition based on event camera outputs. Specifically, a fixed number of asynchronous events are accumulated to form frames for feature extraction. Since events encode dynamic information, we estimate human pose from event frames to encode static information and improve recognition accuracy. The complementary properties of dynamic event frames and static human poses are jointly explored and fused to predict actions. Extensive experiments verify the effectiveness of the proposed model, which achieves a recognition accuracy of 85.91% on the DHP19 dataset.
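The event-accumulation step described above can be sketched as follows. This is a minimal illustration only: the event field layout, the window size of 7500 events, and the count-based normalization are assumptions for demonstration, not the paper's exact settings.

```python
import numpy as np

def events_to_frame(events, height, width, num_events=7500):
    """Accumulate a fixed number of asynchronous events into a 2D frame.

    `events` is assumed to be an array with rows (x, y, polarity, timestamp);
    the layout and `num_events` window are illustrative, not the paper's
    exact configuration.
    """
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, p, t in events[:num_events]:
        frame[int(y), int(x)] += 1.0  # count events landing on each pixel
    # Normalize to [0, 1] so the frame can feed a standard CNN input.
    if frame.max() > 0:
        frame /= frame.max()
    return frame
```

The resulting frame can then be passed to a feature-extraction backbone, with a second pathway estimating human pose from the same frames before fusion.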
