Abstract

Computer vision plays an important role in intelligent systems, particularly autonomous mobile robots and intelligent vehicles. It is essential to the correct operation of such systems and increases safety for users and passengers as well as for other people in the environment. One of its levels of analysis is semantic segmentation, which supports scene understanding, a task of utmost importance in autonomous navigation. Recent developments have shown the power of deep learning models applied to semantic segmentation, and 3D data offers a richer representation of the world. Although many studies compare the performance of semantic segmentation models, they mostly consider 2D images, and none of them includes recent GAN models in the analysis. In this paper, we study, implement and compare recent deep learning models for 3D semantic image segmentation: FCN, SegNet and Pix2Pix. The 3D images were captured indoors and gathered in a dataset created for this project. Our main objective is to evaluate and compare the models’ performance and efficiency in detecting obstacles and safe and unsafe zones for autonomous mobile robot navigation. Using mean IoU, number of parameters and inference time as metrics, our experiments show that Pix2Pix, a recent Conditional Generative Adversarial Network, outperforms the FCN and SegNet models in the

