Abstract
The Faster Region-based Convolutional Network (Faster R-CNN) was recently proposed and achieves outstanding performance in object detection. Specifically, Faster R-CNN uses a Region Proposal Network (RPN) to efficiently predict region proposals over a wide range of scales and aspect ratios. Nevertheless, when the number and quality of the region proposals generated by the RPN are not ideal, the detection performance of Faster R-CNN suffers. In this paper, multiple strategies are applied to address these limitations and improve the RPN, resulting in a novel architecture for region proposal generation named the Multi-strategy Region Proposal Network (MSRPN). MSRPN introduces four improvements. First, a novel skip-layer connection network is designed to combine multi-level features and boost the capability of the pooling layers, which improves the quality of the region proposals. Second, improved anchor boxes are introduced with adaptive aspect ratios and evenly spaced scales. In this way, the number of region proposals needed for detection is substantially reduced and the efficiency of object localization is increased. The first and second improvements together particularly enhance small object detection. Third, the classification layer and the regression layer are unified into a single convolutional layer, which reduces the model complexity of the output layer and thus accelerates training and testing. Fourth, the bounding box regression term of the multi-task loss function in the RPN is improved, which improves bounding box regression performance. In the experiments, MSRPN is compared with the Fast Region-based Convolutional Network (Fast R-CNN), Faster R-CNN, Inside-Outside Net (ION), Multi-region CNN (MR-CNN) and HyperNet. MSRPN achieves state-of-the-art mean average precision (mAP) of 78.9%, 74.8% and 32.1% on the PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO data sets, respectively, with the deep VGG-16 model, surpassing the five other object detection methods. These results are obtained with only 150 region proposals per image. In addition, MSRPN performs very well on small object detection and runs at 6 fps, which is faster than the other methods. In conclusion, MSRPN can provide important support for intelligent object detection systems.
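To make the anchor box idea concrete, the following is a minimal sketch of how RPN-style anchors are typically enumerated from a set of scales and aspect ratios. The baseline setting (base size 16, scales 8, 16, 32 and aspect ratios 0.5, 1, 2, giving 9 anchors per location) follows the original Faster R-CNN configuration. The abstract does not give the scale values or the adaptive aspect ratio rule used in MSRPN, so the helper `evenly_spaced_scales` and the values passed to it are hypothetical placeholders, not the authors' settings.

```python
import math


def make_anchors(base_size, scales, aspect_ratios):
    """Return anchors as (x1, y1, x2, y2) tuples centered on the origin.

    Each anchor has area (base_size * scale) ** 2 and a height/width
    ratio equal to the given aspect ratio.
    """
    anchors = []
    for scale in scales:
        area = float(base_size * scale) ** 2
        for ratio in aspect_ratios:
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors


def evenly_spaced_scales(smallest, largest, count):
    """Hypothetical helper: `count` scales at a constant interval."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (largest - smallest) / (count - 1)
    return [smallest + i * step for i in range(count)]


# Baseline Faster R-CNN setting: 3 scales x 3 ratios = 9 anchors per location.
baseline = make_anchors(16, [8, 16, 32], [0.5, 1.0, 2.0])

# Assumed, illustrative alternative with evenly spaced scales (placeholder values).
even_scales = evenly_spaced_scales(4, 32, 5)
even = make_anchors(16, even_scales, [0.5, 1.0, 2.0])
```

In a full RPN these origin-centered anchors would be shifted to every position of the feature map; the sketch covers only the per-location enumeration.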