Abstract

• We propose the GDM to partition the sample space for improving the discriminability of the scene-specific object detector. • We propose the online gradual optimization procedure to automatically improve the performance of the GDM-based detector. • We propose a self-learning framework to train the GDM-based detector by marking several bounding boxes in the first frame. One object class may show large variations due to diverse illuminations, backgrounds, and camera viewpoints in the multi-scene object detection task. Traditional object detection methods generally perform poorly under unconstrained video environments. To address this problem, many modern approaches provide deep hierarchical appearance representations for object detection. Most of these methods require time-consuming training procedures on large manually annotated sample sets. In this paper, we propose a self-learning object detection framework to resolve the multi-scene detection problem in a bottom-up manner. A scene-specific objector is obtained from an autonomous learning process triggered by marking several bounding boxes around an object in the first video frame via a mouse. Here, artificially labeled training data or generic detectors are not needed. This learning process is conveniently replicated many times in different surveillance scenarios and produces scene-specific detectors from various camera viewpoints. Obviously, the initial scene-specific detector, initialized by several bounding boxes, exhibits poor detection performance and is difficult to be improved by traditional online learning algorithms . Consequently, we propose the Generative-Discriminative model (GDM) based detection method to partition detection response space and assign each partition an individual descriptor that progressively achieves high classification accuracy . Online gradual optimization process is proposed to optimize the Generative-Discriminative model and focus on those hard samples lying near the decision boundary. Experimental results on nine video datasets show that our approach achieves comparable performance to that of robust supervised methods, and outperforms state-of-the-art scene-specific object detection methods under varying imaging conditions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call