Abstract

Our goal is to develop embedded and mobile vision applications that leverage state-of-the-art visual sensors and efficient neural network architectures deployed on emerging neural computing engines for smart monitoring and inspection purposes. In this paper, we present a 360\(^{\circ }\) vision system onboard an automobile or UAV platform for large field-of-view, real-time detection of multiple challenging objects. The targeted objects include a flag, as a deformable object; a UAV, as a tiny flying object whose scale and position change rapidly; and a group of objects, piled sandbags (themselves deformable objects in a group), a flag, and a stop sign, which together form a scene representing an artificial fake checkpoint. Barrel distortions owing to the 360\(^{\circ }\) optics make the detection task even more challenging. A lightweight neural network model based on the MobileNets architecture is transfer-learned to detect the custom objects with very limited training data. In method 1, we generated a dataset of perspective planar images via a virtual camera model that projects a patch on the hemisphere to a 2D plane. In method 2, the panomorph images are used directly, without projection. Real-time detection of objects in 360\(^{\circ }\) video is realized by feeding live-streamed frames, captured by the full-hemispheric (180\(^{\circ }\) \(\times \) 360\(^{\circ }\)) field-of-view ImmerVision Enables panomorph lens, to the trained MobileNets model. We found that with only a small amount of training data, far fewer samples than ten times the Vapnik–Chervonenkis dimension of the model, the MobileNets model achieves a detection rate of 80–90% on test data drawn from a distribution similar to the training data. However, the model's performance dropped drastically when it was put into action in the wild on unknown data with different weather and lighting conditions. The generalization capability of the model can be improved by training with more data.
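The virtual-camera projection of method 1 can be sketched roughly as follows, assuming an equidistant (r proportional to \(\theta \)) fisheye model with the optical centre at the image centre; the function name, parameters, and projection model here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def perspective_from_fisheye(fisheye, out_size, fov_deg, yaw_deg, pitch_deg):
    """Project a patch of a hemispheric fisheye image onto a perspective
    plane via a virtual pinhole camera (a sketch of 'method 1').

    Assumes an equidistant projection model (r = k * theta) with the
    optical centre at the image centre -- both are assumptions, since the
    true panomorph lens profile is vendor-specific.
    """
    h_out, w_out = out_size
    H, W = fisheye.shape[:2]
    cx, cy = W / 2.0, H / 2.0        # assumed optical centre
    r_max = min(cx, cy)              # image radius covering theta = 90 deg

    # Pinhole focal length for the requested horizontal field of view.
    f = (w_out / 2.0) / np.tan(np.radians(fov_deg) / 2.0)

    # Unit rays through each output pixel in the virtual camera frame.
    u, v = np.meshgrid(np.arange(w_out) - w_out / 2.0,
                       np.arange(h_out) - h_out / 2.0)
    rays = np.stack([u, v, np.full_like(u, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate rays to the chosen viewing direction on the hemisphere.
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    rays = rays @ (Ry @ Rx).T

    # Equidistant fisheye model: radial distance proportional to the
    # angle theta between the ray and the optical axis.
    theta = np.arccos(np.clip(rays[..., 2], -1.0, 1.0))
    phi = np.arctan2(rays[..., 1], rays[..., 0])
    r = r_max * theta / (np.pi / 2.0)

    # Nearest-neighbour sampling of the source fisheye image.
    xi = np.clip(np.round(cx + r * np.cos(phi)).astype(int), 0, W - 1)
    yi = np.clip(np.round(cy + r * np.sin(phi)).astype(int), 0, H - 1)
    return fisheye[yi, xi]
```

Each perspective patch produced this way looks like an ordinary pinhole-camera image, so it can be fed to a detector trained on conventional planar imagery.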
The contribution of this work is a 360\(^{\circ }\) vision hardware and software system for real-time detection of challenging objects. This system could be configured for very low-power embedded applications by running inference via a neural computing engine such as the Intel Movidius NCS2 or HiSilicon Kirin 970.
