Weakly supervised methods for video anomaly detection have proliferated in recent years. Despite significant progress, existing work has primarily addressed the problem in Euclidean space. This choice imposes a fundamental limitation: the dimensionality constraints of a Euclidean embedding space restrict the ability to model complex patterns, and the resulting representations lack the capacity to capture long-term contextual information. Such insufficient video representations can lead to anomalous events being misjudged. Hyperbolic space, by contrast, has shown significant potential for modeling complex data. In this paper, we rethink weakly supervised video anomaly detection from a new perspective: transforming video features from Euclidean space into hyperbolic space may enable the network to learn implicit relationships in normal and anomalous videos, and thereby to distinguish between them more effectively. To validate our approach, we conduct extensive experiments on the UCF-Crime and XD-Violence datasets. The results show that our method has the fewest parameters and also achieves state-of-the-art performance on the XD-Violence dataset using only RGB information.