Abstract

This paper addresses the problem of detecting people and vehicles on a surface mine by presenting an architecture that combines the complementary strengths of deep convolutional networks (DCNs) with cluster-based analysis. We show that using a DCN as a naïve black box results in a high error rate, owing to the lack of mining-specific training data and the unique landscape of a mine site. We propose a background model that exploits the abundance of background-only images to discover the natural clusters in visual appearance, using features extracted from the DCN. Both a simple nearest-cluster background model and an extended model with cosine features are investigated for their ability to identify and suppress potential false positives made by the DCN. Localization of objects of interest is enabled through region proposals, which have been tuned to increase recall within the constraints of a computational budget. Finally, a soft fusion framework is presented that combines the estimates of the DCN and the background model to improve detection accuracy. Our system is tested on over 11 km of real mine site data in both day and night conditions, where we were able to detect both light and heavy vehicles along with mining personnel. We show that the introduction of our background model improves detection performance. In particular, soft fusion of the background model and the DCN output produces a relative improvement in the F1 score of 46% and 28% compared to a baseline pretrained DCN and a DCN retrained with mining images, respectively.
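To make the idea of a nearest-cluster background model with soft fusion concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in the paper. It assumes that cluster centres have already been learned from DCN features of background-only images, that Euclidean distance is used to find the nearest cluster, and that fusion is a weighted geometric mean; the function names and the `scale` and `weight` parameters are hypothetical and chosen only for illustration.

```python
import math

def background_distance(feature, centroids):
    """Euclidean distance from a feature vector to its nearest background
    cluster centre. A small value means the region resembles background.
    (Assumption: the abstract does not specify the distance metric.)"""
    return min(math.dist(feature, c) for c in centroids)

def soft_fuse(dcn_conf, bg_distance, scale=1.0, weight=0.5):
    """Combine a DCN confidence in [0, 1] with the background model.
    The distance is mapped to a foreground likelihood in [0, 1) and the
    two estimates are merged with a weighted geometric mean.
    (Assumption: this fusion rule is illustrative, not the paper's.)"""
    fg_likelihood = 1.0 - math.exp(-bg_distance / scale)
    return (dcn_conf ** (1.0 - weight)) * (fg_likelihood ** weight)

# Hypothetical 2-D cluster centres standing in for clustered DCN features.
centroids = [[0.0, 0.0], [10.0, 10.0]]

# Two region proposals that the DCN scores with the same high confidence.
near_background = soft_fuse(0.9, background_distance([0.1, 0.0], centroids))
far_from_background = soft_fuse(0.9, background_distance([5.0, 5.0], centroids))
```

In this sketch, a proposal whose feature lies close to a background cluster receives a much lower fused score than one far from every cluster, even though the DCN confidence is identical. This is the false-positive suppression behaviour the abstract describes.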
