Abstract
Automatically inferring ongoing activities enables the early recognition of unfinished activities, which is valuable for applications such as online human-machine interaction and security monitoring. State-of-the-art methods use spatio-temporal interest point (STIP) based features as the low-level video description to handle complex scenes [1, 2, 3]. However, the typical bag-of-visual-words (BoVW) representation focuses on the feature distribution and ignores the inherent contexts in sequences, resulting in low discrimination when dealing directly with limited observations. To solve this problem, the Recurrent Self-Organizing Map (RSOM) [4], which was designed to process sequential data, is newly adopted in this paper for the high-level representation of ongoing activities. The innovation is that observed features and their spatio-temporal contexts are encoded in a trajectory of pre-trained RSOM units. Additionally, a combination of Dynamic Time Warping (DTW) distance and edit distance, named DTW-E, is proposed to measure the structural dissimilarity between RSOM trajectories.

RSOM Trajectory: Since the RSOM is a direct extension of the SOM, we start from the SOM. A SOM maps data from an input space $V_I$ onto a lower-dimensional space $V_L$ (a map) in such a way that the topological relationships in $V_I$ are preserved and the SOM units closely approximate the probability density function of $V_I$. Suppose each unit $i$ in the SOM is associated with a weight vector $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}] \in \mathbb{R}^n$ with the same dimension as the input vector $x = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^n$. The learning process that leads to self-organization on a map can be summarized as follows. (i) The feature vector $x(t)$ is input, then its best matching unit (bmu) on the map is found by computing the minimum distance as:
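A minimal sketch of the best-matching-unit search in step (i), assuming the distance is Euclidean (a common choice for SOMs; the metric is not stated in the text here). The names `find_bmu`, `weights`, and `x`, as well as the three-unit example map, are illustrative and not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def find_bmu(x: Sequence[float], weights: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, distance) of the unit whose weight vector is closest to x.

    Assumption: distance is Euclidean; the metric is not taken from the paper.
    """
    if not weights:
        raise ValueError("the map must contain at least one unit")
    best_index, best_dist = -1, math.inf
    for i, w in enumerate(weights):
        if len(w) != len(x):
            raise ValueError("weight and input vectors must have the same dimension")
        dist = math.dist(x, w)  # Euclidean distance between x and w_i
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


# Hypothetical 3-unit map in R^2, for illustration only.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
bmu_index, bmu_dist = find_bmu([0.9, 1.2], weights)
```

Here the input `[0.9, 1.2]` lies closest to the second weight vector, so `bmu_index` is 1. A linear scan is used for clarity; for large maps a vectorized or tree-based search may be preferable.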