Abstract

This paper introduces a fast and accurate object detection algorithm based on a convolutional neural network for humanoid marathon robot applications. The algorithm is capable of operating on a low-performance CPU without relying on a GPU or hardware accelerator. A new region proposal algorithm, based on color segmentation, is proposed to extract a region containing a potential object. As a classifier, the convolutional neural network is used to predict object classes from the proposed region. In the training phase, the classifier is trained with an Adam optimizer to minimize the loss function, using datasets collected from humanoid marathon competitions and diversified through image augmentation. An NVIDIA GTX 1070 training machine, with 500 batch images per epoch and a learning rate of 0.001, required 12 seconds to reduce the loss value below 0.0374. In the accuracy evaluation, the proposed method successfully recognizes and localizes three classes of markers with a training accuracy of 99.929%, validation accuracy of 99.924%, and test accuracy of 98.821%. As a real-time benchmark, the algorithm achieves 41.13 FPS while running on a robot computer with an Intel i3-5010U CPU @ 2.10GHz.
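To illustrate the color-segmentation region proposal described above, the following is a minimal sketch, not the authors' implementation: it thresholds an image on a color range and returns the bounding box of the in-range pixels as a candidate region. The function name, color thresholds, and synthetic test frame are all illustrative assumptions.

```python
import numpy as np

def propose_region(image, lower, upper):
    """Color-segmentation region proposal (illustrative sketch):
    keep pixels whose channels fall inside [lower, upper] and
    return the bounding box (x, y, w, h) of the resulting blob."""
    mask = np.all((image >= lower) & (image <= upper), axis=-1)
    if not mask.any():
        return None  # no candidate region in this frame
    ys, xs = np.nonzero(mask)
    return (int(xs.min()), int(ys.min()),
            int(xs.max() - xs.min() + 1),
            int(ys.max() - ys.min() + 1))

# Synthetic 64x64 RGB frame with a red 12x10 "marker" patch.
frame = np.zeros((64, 64, 3), dtype=np.uint8)
frame[30:40, 20:32] = (200, 30, 30)

box = propose_region(frame, lower=(150, 0, 0), upper=(255, 80, 80))
print(box)  # (20, 30, 12, 10)
```

In the paper's pipeline, the cropped region returned by such a proposal step would then be passed to the CNN classifier to predict the marker class.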

Highlights

  • Object detection is the most common problem for robotic vision

  • State-of-the-art Convolutional Neural Network (CNN) based object detection successfully addresses the problems in this domain, although it requires a high-performance computing platform

  • The inference results of the proposed object detection approach were evaluated on robot computing hardware (Figure 10) for each marker: left, right, and forward

Introduction

Object detection is the most common problem in robotic vision: both the class and the position of an object must be predicted. State-of-the-art Convolutional Neural Network (CNN) based object detection successfully addresses the problems in this domain, although it requires a high-performance computing platform. Santos et al. compared the performance of three CNN algorithms for detecting tree species using an RGB camera on an Unmanned Aerial Vehicle (UAV) [1]. Faster R-CNN [2], YOLOv3 [3], and RetinaNet [4] were evaluated using a folding approach and successfully reached over 82.48% validation accuracy. However, the detection process was offline, used an off-board computing platform, and was not performed in real time.

