Abstract

Early detection of dangerous objects such as handguns in Closed-Circuit Television (CCTV) images is vital to reduce potential damage. This work proposes a novel method for automatic handgun detection in CCTV-like images, based on a combined architecture that leverages body pose estimation. Weapon appearance features and body pose features are combined to perform robust detection in typical surveillance environments where appearance features alone are not sufficient (e.g., because the handgun may appear too small or too dark). Both CNN and recent transformer-based architectures are applied for visual feature extraction. Experiments on multiple datasets show that this approach improves on state-of-the-art pose-based handgun detectors. An ablation study is also performed to verify the contribution of the pose processing branch and the false positive filter.
