Abstract

Object detection is one of the most active areas in computer vision and has improved significantly in recent years. Current state-of-the-art object detection methods mostly follow the regions with convolutional neural network (R-CNN) framework and use only local appearance features inside object bounding boxes. Because these approaches ignore the contextual information around the object proposals, such detectors may produce a semantically incoherent interpretation of the input image. In this paper, we propose an ensemble object detection system that incorporates local appearance, contextual information in terms of relationships among objects, and a global scene-based contextual feature generated by a convolutional neural network. The system is formulated as a fully connected conditional random field (CRF) defined on object proposals, in which the contextual constraints among object proposals are naturally modeled as edges. Furthermore, a fast mean field approximation method is used to perform inference in this CRF model efficiently. The experimental results demonstrate that our approach achieves a higher mean average precision (mAP) on the PASCAL VOC 2007 dataset than the baseline algorithm, Faster R-CNN.
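
To make the inference step concrete, the following is a minimal sketch of parallel mean field updates on a fully connected CRF over object proposals. It is an illustration under stated assumptions, not the paper's implementation: the function name `mean_field`, the arguments `unary`, `edge_weight` and `compat`, and the particular energy form (per-proposal label scores plus weighted pairwise label compatibilities) are all hypothetical choices, since the abstract does not specify the potentials.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def mean_field(unary, edge_weight, compat, n_iters=10):
    """Approximate per-proposal label marginals with parallel mean field updates.

    Assumed (hypothetical) model:
        score(x) = sum_i unary[i, x_i]
                 + sum_{i<j} edge_weight[i, j] * compat[x_i, x_j]

    unary:       (N, K) label scores per proposal, e.g. detector log-scores.
    edge_weight: (N, N) symmetric non-negative weights between proposals.
    compat:      (K, K) symmetric label compatibility; larger means the two
                 labels are more likely to co-occur.
    Returns an (N, K) array whose rows are approximate marginals.
    """
    w = np.array(edge_weight, dtype=float)
    np.fill_diagonal(w, 0.0)  # a proposal sends no message to itself
    q = softmax(np.asarray(unary, dtype=float))  # initialise from local scores
    for _ in range(n_iters):
        # message[i, a] = sum_j w[i, j] * sum_b compat[a, b] * q[j, b]
        message = w @ q @ compat.T
        q = softmax(unary + message)
    return q
```

For example, with two proposals where the first is confidently labelled and the second is ambiguous, a compatibility matrix that rewards matching labels pulls the second proposal toward the first one's label:

```python
unary = np.array([[5.0, 0.0], [0.0, 0.0]])
edge_weight = np.array([[0.0, 1.0], [1.0, 0.0]])
compat = np.eye(2)
q = mean_field(unary, edge_weight, compat)
```

The dense matrix product costs O(N^2 K) per iteration, which is acceptable for a few hundred proposals. Parallel updates of this kind are not guaranteed to converge in general, so a fixed small iteration count is used here.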
