Abstract

In this paper, an action recognition method that adaptively handles variations in camera viewpoint is introduced. Our contribution is three-fold. First, a space-sampling algorithm based on multi-scale affine transforms is proposed to generate a series of different viewpoints from a single one. A histogram of dense optical flow is then extracted over each fixed-size patch of a generated viewpoint as a local feature descriptor. Second, a dimension selection procedure is proposed to retain only the dimensions of the feature space that carry distinctive information and to discard the unnecessary ones. Third, to handle situations in which video data from multiple viewpoints are available for training, an extended method with a voting algorithm is introduced to further increase recognition accuracy. The proposed method is validated through experiments on both simulated and realistic datasets (http://www.aislab.org/index.php/en/mvar-datasets). It is found to be accurate and to maintain its accuracy under a wide range of viewpoint changes, and it is less sensitive to variations in subject scale, subject position, action speed, partial occlusion, and background. The method is also validated against state-of-the-art view-invariant action recognition methods on the well-known i3DPost and MuHAVi public datasets.
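To make the pipeline concrete, the following is a minimal Python sketch of the first contribution: warping a single view with multi-scale affine transforms to simulate additional viewpoints, then extracting a per-patch histogram of dense optical flow as the local descriptor. The angle/scale grid, patch size, bin count, and the use of OpenCV's Farneback flow are illustrative assumptions, not the paper's exact settings.

```python
import cv2
import numpy as np

def generate_views(frame, angles=(-30, -15, 0, 15, 30), scales=(0.8, 1.0, 1.2)):
    """Simulate alternative camera viewpoints with multi-scale affine warps.

    A simplified stand-in for the paper's space-sampling step; the exact
    transform family and sampling grid used by the authors may differ.
    """
    h, w = frame.shape[:2]
    views = []
    for angle in angles:
        for scale in scales:
            # 2x3 affine matrix combining rotation about the image
            # centre with isotropic scaling.
            M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
            views.append(cv2.warpAffine(frame, M, (w, h)))
    return views

def hoof_descriptor(prev_gray, next_gray, patch=32, bins=8):
    """Histogram of dense optical flow over fixed-size patches.

    Expects two consecutive grayscale frames of one generated view;
    returns the concatenated, per-patch-normalised histograms.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in [0, 2*pi)
    h, w = mag.shape
    descriptor = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            # Orientation histogram weighted by flow magnitude.
            hist, _ = np.histogram(ang[y:y + patch, x:x + patch],
                                   bins=bins, range=(0, 2 * np.pi),
                                   weights=mag[y:y + patch, x:x + patch])
            descriptor.append(hist / (hist.sum() + 1e-8))
    return np.concatenate(descriptor)
```

Concatenating the per-patch histograms across each generated view yields the local feature vector on which the dimension selection step (contribution two) would then operate, keeping only the distinctive dimensions.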
