Abstract
Current research in human action recognition (HAR) focuses on efficient and effective modelling of the temporal features of human actions in 3-dimensional space. Echo State Networks (ESNs) are well suited to encoding temporal context because of their short-term memory property. However, the random initialization of an ESN's input and reservoir weights may increase instability and variance in generalization. Inspired by the notion that input-dependent self-organization is decisive for the cortex in adjusting neurons to the distribution of the inputs, a Self-Organizing Reservoir Network (SORN) is developed, using Adaptive Resonance Theory (ART) and Instantaneous Topological Mapping (ITM) as the clustering process, to provide deterministic initialization of the ESN reservoirs in a Convolutional Echo State Network (ConvESN) and yield a Self-Organizing Convolutional Echo State Network (SO-ConvESN). SORN ensures that the activations of the ESN's internal echo state representations reflect topological qualities similar to those of the input signal, which should yield a self-organizing reservoir. For the HAR task, human actions encoded as multivariate time series signals are clustered by SORN into node centroids and interconnectivity matrices, which are used to initialize the SO-ConvESN reservoirs. Using several publicly available 3D-skeleton-based action recognition datasets, the impact of the vigilance threshold and reservoir perturbation on SORN clustering, the SORN reservoir dynamics, and the capability of SO-ConvESN on the HAR task have been empirically evaluated and analyzed, with competitive experimental results.
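The idea of replacing random reservoir weights with weights derived from clustering the input can be illustrated with a minimal sketch. It is not the SORN algorithm itself: the clustering below is a simplified vigilance-threshold scheme standing in for the ART/ITM process, and every name in it (`cluster_with_vigilance`, `scale_spectral_radius`, `run_reservoir`, the `vigilance` and `rho` values, the synthetic three-channel signal) is a hypothetical choice made for illustration. It assumes that node centroids serve as input weights and that the interconnectivity matrix, rescaled to a spectral radius below 1, serves as the reservoir weights.

```python
import numpy as np


def cluster_with_vigilance(signal, vigilance):
    """Toy online clustering (a stand-in, not ART/ITM as used by SORN).

    A sample joins its nearest node if it lies within `vigilance`;
    otherwise it founds a new node. Consecutive winning nodes are linked,
    which gives a symmetric interconnectivity matrix.
    """
    centroids = [signal[0].astype(float).copy()]
    counts = [1]
    edges = set()
    prev = 0
    for u in signal[1:]:
        dists = [np.linalg.norm(u - c) for c in centroids]
        win = int(np.argmin(dists))
        if dists[win] > vigilance:
            # No node is close enough: create a new one at the sample.
            centroids.append(u.astype(float).copy())
            counts.append(1)
            win = len(centroids) - 1
        else:
            # Move the winning centroid toward the sample (running mean).
            counts[win] += 1
            centroids[win] += (u - centroids[win]) / counts[win]
        if win != prev:
            edges.add((min(win, prev), max(win, prev)))
        prev = win
    n = len(centroids)
    adjacency = np.zeros((n, n))
    for i, j in edges:
        adjacency[i, j] = adjacency[j, i] = 1.0
    return np.array(centroids), adjacency


def scale_spectral_radius(W, rho=0.9):
    """Rescale W so its largest absolute eigenvalue equals rho."""
    radius = np.max(np.abs(np.linalg.eigvals(W)))
    return W if radius == 0 else W * (rho / radius)


def run_reservoir(signal, W_in, W_res):
    """Standard ESN state update: x(t) = tanh(W_in u(t) + W_res x(t-1))."""
    x = np.zeros(W_res.shape[0])
    states = []
    for u in signal:
        x = np.tanh(W_in @ u + W_res @ x)
        states.append(x.copy())
    return np.array(states)


# Synthetic 3-channel time series standing in for skeleton joint features.
t = np.linspace(0.0, 4.0 * np.pi, 200)
signal = np.stack([np.sin(t), np.cos(t), np.sin(2.0 * t)], axis=1)

centroids, adjacency = cluster_with_vigilance(signal, vigilance=0.5)
W_in = centroids                                  # assumption: centroids as input weights
W_res = scale_spectral_radius(adjacency, rho=0.9)  # assumption: links as reservoir weights
states = run_reservoir(signal, W_in, W_res)
print(states.shape)  # (time steps, number of nodes)
```

Because no random numbers are drawn, running the sketch twice on the same signal gives identical weights and echo states, which is the property the deterministic initialization is meant to provide. In this toy version, a smaller `vigilance` produces more nodes and therefore a larger reservoir.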