Abstract

Incremental learning is a topic of great interest in current machine learning research. Real-world problems often require a classifier to incorporate new knowledge while preserving what was learned before. Human Action Recognition (HAR) in videos is one of the most challenging problems in computer vision, yet most existing works approach it from a non-incremental point of view. This work proposes a framework for performing HAR in the incremental learning scenario, called Incremental Human Action Recognition with Dual Memory (IHAR-DM). IHAR-DM contains three main components: a 3D convolutional neural network for capturing spatio-temporal features; a Triplet Network to perform metric learning; and the dual-memory Extreme Value Machine, which is introduced in this work. The proposed method is compared with 10 other state-of-the-art incremental learning models. We propose five experimental settings containing different numbers of tasks and classes using two widely known HAR datasets: UCF-101 and HMDB51. Our results show superior performance in terms of Normalized Mutual Information (NMI) and Inter-task Intransigence (ITI), a new metric proposed in this work. Overall, the results show the feasibility of the proposal for real HAR problems, most of which present the requirements imposed by incremental learning.
