Abstract

Reliable obstacle detection is essential for the navigation of unmanned agricultural machinery. Most existing methods achieve strong performance by relying on visual prior information of obstacles, that is, the visual features learned by the model during training. However, collecting enough annotated images for training is often difficult, and in the absence of such images these methods cannot perform optimally. To address this problem, this paper presents a zero-shot obstacle detection model based on the You Only Look Once X (YOLOX) backbone, introducing the concept of zero-shot learning into real-time obstacle detection systems. Specifically, a cascade of encoder and decoder modules is appended to the model, and semantic space-based classification is integrated with anchor-free localization to perform zero-shot obstacle detection. The feasibility of the proposed method was verified on a real farmland test data set. The experimental results show that the F1 scores for seen obstacles (e.g., people) and unseen obstacles (e.g., agricultural machinery) are 96.22% and 94.66%, respectively. The average detection time per panoramic image is 52.52 ms (equivalent to 19.04 FPS). The proposed method exhibits superior performance when training samples of the target category are unavailable. Notably, the proposed model not only correctly classifies unseen obstacles but also improves detection performance for both seen and unseen obstacles, achieving a balance between accuracy and detection speed.
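For readers unfamiliar with semantic space-based zero-shot classification, the sketch below illustrates the general idea as it is commonly implemented; the module and parameter names are hypothetical and are not taken from the paper. Visual features from an anchor-free detection head are projected into a word-embedding space and scored against class embeddings, so an unseen class can be recognized from its embedding alone, without any training images.

```python
# Minimal sketch of a semantic-space classification head for zero-shot
# detection. Names and shapes are illustrative assumptions, not the
# paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticSpaceClsHead(nn.Module):
    """Projects per-location visual features into a semantic embedding space
    and scores them against class word embeddings (seen + unseen)."""

    def __init__(self, feat_dim: int, sem_dim: int, class_embeddings: torch.Tensor):
        super().__init__()
        # Encoder: visual feature -> semantic space (a stand-in for the
        # encoder/decoder cascade described in the abstract).
        self.encoder = nn.Sequential(
            nn.Conv2d(feat_dim, sem_dim, kernel_size=1),
            nn.BatchNorm2d(sem_dim),
        )
        # Fixed word embeddings, one row per class; unseen classes only need
        # their embedding at inference time, never any annotated images.
        self.register_buffer("class_emb", F.normalize(class_embeddings, dim=1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, feat_dim, H, W) from an anchor-free head (e.g., YOLOX-style).
        sem = F.normalize(self.encoder(feats), dim=1)  # (B, sem_dim, H, W)
        # Cosine similarity between each spatial location and every class embedding.
        logits = torch.einsum("bchw,kc->bkhw", sem, self.class_emb)
        return logits  # (B, num_classes, H, W)


if __name__ == "__main__":
    # Toy example: 2 seen classes + 1 unseen class, 300-D word vectors.
    class_emb = torch.randn(3, 300)
    head = SemanticSpaceClsHead(feat_dim=256, sem_dim=300, class_embeddings=class_emb)
    scores = head(torch.randn(1, 256, 20, 20))
    print(scores.shape)  # torch.Size([1, 3, 20, 20])
```

Bounding-box regression would come from a separate anchor-free localization branch, with the semantic scores above replacing the usual learned classification logits.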
