Abstract

Person detection is an important problem in computer vision with many real-world applications. It remains challenging because of variations in pose, occlusion, and lighting conditions. The purpose of this study is to detect human heads in natural scenes taken from a publicly available dataset of Hollywood movies. We use state-of-the-art object detectors based on deep convolutional neural networks. These include region-based convolutional neural networks, which rely on region proposals for detection, and single-shot detectors, which detect objects by looking at the image only once. We apply transfer learning to fine-tune networks that were already trained on a massive amount of data. During fine-tuning, the models that achieve high mean Average Precision (mAP) are selected for evaluation on the test dataset. Experimental results show that Faster R-CNN [18] and SSD MultiBox [13] with VGG16 [21] outperform YOLO [17] and also show significant improvements over several baseline approaches.
