Abstract

This paper proposes new features, extracted from images derived from optical flow, for first-person activity recognition. Features from convolutional neural networks (CNNs), which are designed for 2D images, have attracted attention from computer vision researchers because of their powerful discrimination capability, and recently a convolutional neural network for videos, called C3D (Convolutional 3D), was proposed. CNN / C3D features are generally extracted directly from the original images / videos with a pre-trained convolutional neural network, since the network was trained on images / videos. In this paper, by contrast, we propose using images derived from flow (which we call optical flow images) as input to the pre-trained network, for two reasons: (i) flow images convey dynamic information that is useful for activity recognition, whereas the original images convey only static information, and (ii) the pre-trained network can still extract features with reasonable discrimination capability, since it was trained on a huge number of images spanning many categories. We carry out experiments on the DogCentric Activity Dataset and show the effectiveness of the extracted features.
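As a rough illustration of the pipeline the abstract describes, the sketch below computes a dense optical flow image between two consecutive frames and extracts a feature vector from it with a network pre-trained on ordinary RGB images. This is a minimal sketch, not the authors' implementation: the Farneback flow parameters, the HSV flow-to-image encoding, and the use of torchvision's ResNet-18 as a stand-in for the CNN / C3D networks mentioned in the abstract are all assumptions made for illustration.

```python
# Minimal sketch (assumptions, not the paper's code): render dense optical
# flow as an "optical flow image" and feed it to a CNN pre-trained on RGB
# images to obtain a feature vector.
import cv2
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

def flow_to_image(prev_frame, frame):
    """Encode dense optical flow as an HSV image (hue = direction, value = magnitude)."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros_like(prev_frame)
    hsv[..., 0] = ang * 180 / np.pi / 2                               # direction -> hue
    hsv[..., 1] = 255                                                 # full saturation
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)   # magnitude -> value
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# Pre-trained 2D CNN used as a fixed feature extractor (classification layer removed).
# ResNet-18 is only a stand-in here; the paper refers to CNN / C3D networks.
cnn = models.resnet18(weights="IMAGENET1K_V1")
cnn.fc = torch.nn.Identity()
cnn.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def flow_image_features(prev_frame, frame):
    """Feature vector of the optical flow image, from a network trained on static images."""
    flow_img = flow_to_image(prev_frame, frame)
    with torch.no_grad():
        feat = cnn(preprocess(flow_img).unsqueeze(0))
    return feat.squeeze(0).numpy()
```

In the abstract's terms, `flow_image_features` replaces the original frame with an optical flow image before the pre-trained network, so the extracted features carry the dynamic information that motivates the proposal.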
