This paper presents an approach for event recognition in sequential images using human body part features and their surrounding context. Key body points were approximated to track and monitor their presence in complex scenarios. Various feature descriptors, including MSER (Maximally Stable Extremal Regions), SURF (Speeded-Up Robust Features), distance transform, and DOF (Degrees of Freedom), were applied to skeleton points, while BRIEF (Binary Robust Independent Elementary Features), HOG (Histogram of Oriented Gradients), FAST (Features from Accelerated Segment Test), and Optical Flow were used on silhouettes or full-body points to capture both geometric and motion-based features. Feature fusion was employed to enhance the discriminative power of the extracted data and the physical parameters calculated by different feature extraction techniques. The system utilized a hybrid CNN (Convolutional Neural Network) + RNN (Recurrent Neural Network) classifier for event recognition, with Grey Wolf Optimization (GWO) for feature selection. Experimental results showed significant accuracy, achieving 98.5% on the UCF-101 dataset and 99.2% on the YouTube dataset. Compared to state-of-the-art methods, our approach achieved better performance in event recognition.
Read full abstract