Abstract
The objective of human action recognition is to interpret ongoing events and their context from video data for automated systems. In this paper, the motion history image (MHI) is used as the region of interest (ROI) of an action during the training phase to recognize human actions effectively. As a result, the spatio-temporal interest points (STIPs) used to train the classifier model are free from noisy interest points caused by cluttered backgrounds and illumination changes. After the STIPs are extracted, histogram of oriented gradients (HOG) and histogram of optical flow (HOF) features are computed for the video patches extracted around them. Action recognition is performed by computing the mutual information of each STIP with respect to every action class in the training dataset, where the mutual information is obtained through random forest voting. The trees of the random forest are built using the proposed semi-supervised learning scheme: up to a predefined depth, tree nodes are split in an unsupervised manner by maximizing the variance of the feature differences of the hypotheses; beyond that depth, node splitting is carried out by a binary error minimization technique based on supervised learning. Experiments are performed on the standard KTH dataset, where the proposed technique achieves 95% accuracy, outperforming earlier reported methods. Furthermore, similar action classes from the Weizmann dataset are tested on the same KTH-trained forest model, and the results are comparable with state-of-the-art methods.
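The motion history image used above as the action ROI follows the classic recursive formulation (Bobick and Davis): pixels where the frame-to-frame difference exceeds a threshold are stamped with a duration value tau, and all other pixels decay linearly toward zero. A minimal NumPy sketch of that update is below; the function name, the toy moving-square sequence, and the tau/threshold values are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=10, threshold=30):
    """One step of the classic MHI update (assumed formulation):
    motion pixels (|frame - prev_frame| > threshold) are set to tau;
    all other pixels decay by 1, floored at 0."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    motion = diff > threshold
    return np.where(motion, tau, np.maximum(mhi - 1, 0))

# Toy example: a bright square translating one pixel per frame.
frames = []
for t in range(5):
    f = np.zeros((16, 16), dtype=np.uint8)
    f[4:8, 4 + t:8 + t] = 255
    frames.append(f)

mhi = np.zeros((16, 16), dtype=np.int16)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)

# Pixels with nonzero MHI form the motion ROI; STIPs falling outside
# this mask would be discarded as background noise in the paper's setup.
roi = mhi > 0
```

Thresholding the accumulated MHI like this yields a binary mask covering only recently moving regions, which is what lets the training stage ignore interest points fired by static clutter.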