Semantic SLAM Based on Deep Learning in Endocavity Environment

Haibin Wu, Yuji Iwahori, Ruotong Xu, Aili Wang, Yan Zhang, Kaiyang Xu, Jianbo Zhao

doi:10.3390/sym14030614

Open Access

Abstract

Traditional endoscopic treatment methods restrict the surgeon’s field of view. With the advent of robot-assisted surgical techniques, new approaches to laparoscopic visualization have emerged. Lumen simultaneous localization and mapping (SLAM) uses the image sequence captured by the endoscope to estimate the endoscope pose and reconstruct the lumen scene during minimally invasive surgery. This technology gives the surgeon better visual perception and is the basis for surgical navigation systems and medical augmented reality. However, surgical instruments moving in the internal cavity can interfere with the SLAM algorithm, and feature points extracted from the instruments may cause errors. We therefore propose a modified endocavity SLAM method combined with deep learning semantic segmentation. The method introduces into the visual odometry a convolutional neural network based on the U-Net architecture, with a symmetric encoder–decoder structure, in order to solve the binary segmentation problem between surgical instruments and the lumen background and to distinguish dynamic feature points. Using pretrained encoders in the network model improves segmentation performance and yields more accurate pixel-level instrument segmentation. The semantic segmentation is then used to reject feature points that lie on the surgical instruments, which reduces the impact of dynamic instruments. This can provide more stable and accurate mapping results than ordinary SLAM systems.
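The feature-rejection step described in the abstract can be illustrated with a minimal sketch. It assumes the segmentation network has already produced a binary instrument mask for the current frame and that a feature detector has returned keypoints as (x, y) pixel coordinates. The function name `reject_instrument_keypoints`, the array layouts, and the rounding rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def reject_instrument_keypoints(keypoints_xy, instrument_mask):
    """Keep only keypoints that fall on the lumen background.

    keypoints_xy:    (N, 2) array of (x, y) pixel coordinates (assumed layout).
    instrument_mask: (H, W) array, truthy where a pixel belongs to an
                     instrument (assumed to come from the segmentation network).
    Returns the surviving keypoints and the boolean keep-flags.
    """
    pts = np.asarray(keypoints_xy, dtype=float).reshape(-1, 2)
    mask = np.asarray(instrument_mask).astype(bool)
    h, w = mask.shape

    # Round sub-pixel coordinates to the nearest pixel and clamp to the image.
    cols = np.clip(np.rint(pts[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(pts[:, 1]).astype(int), 0, h - 1)

    # A keypoint is kept only if its pixel is not labelled as instrument.
    keep = ~mask[rows, cols]
    return pts[keep], keep


# Toy frame: 4 rows x 6 columns, instrument occupies columns 3..5.
toy_mask = np.zeros((4, 6), dtype=bool)
toy_mask[:, 3:] = True

toy_keypoints = np.array([[1.0, 1.0],   # background
                          [4.0, 2.0],   # on the instrument
                          [2.6, 0.0]])  # rounds to column 3, on the instrument

static_points, keep_flags = reject_instrument_keypoints(toy_keypoints, toy_mask)
```

Only the surviving points would then be passed to matching and pose estimation. In practice the mask might also be dilated by a few pixels so that keypoints on the instrument boundary are rejected as well; that refinement is omitted here.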

Full Text