Abstract

Minimally invasive surgery (MIS) is characterized by less trauma, shorter recovery times, and lower postoperative infection rates. Two-dimensional (2D) laparoscopic imaging lacks depth perception and does not provide quantitative depth information, which limits precise and complex surgical operations. Three-dimensional (3D) laparoscopic imaging gives surgeons depth perception. This study aims at 3D reconstruction of the surgical scene from the disparity map generated by a depth estimation algorithm. An unsupervised autoencoder method with a 101-layer residual convolutional network was proposed to compute accurate disparity. The loss function combines three terms: a left-right consistency loss, a structural similarity loss, and a reconstruction error loss; this combination improves reconstruction accuracy and robustness. The method was validated on the Hamlyn Center Laparoscopic/Endoscopic Video Dataset. The structural similarity index (SSIM) is 0.8349 ± 0.0523 and the peak signal-to-noise ratio (PSNR) is 14.4957 ± 1.9676, indicating that the depth prediction network has high accuracy and robustness. The average time to produce each disparity map is about 16 ms. The experimental results show that the proposed depth estimation method offers dense disparity maps and meets real-time surgical requirements. Future work will focus on network structure optimization, loss function design, and transfer learning to further improve robustness and accuracy.
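To make the three-term objective concrete, the sketch below shows one plausible way the combined loss could be assembled in PyTorch. It is an illustration under stated assumptions, not the authors' implementation: the SSIM window size, the relative weights `w_ssim` and `w_lr`, and the choice of L1 for the reconstruction and consistency terms are hypothetical, and the reconstructed views and warped disparity are assumed to have been produced elsewhere by warping one view with the predicted disparity.

```python
import torch
import torch.nn.functional as F

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM dissimilarity over 3x3 local windows (average pooling).
    Returns values in [0, 1]; 0 means structurally identical patches."""
    mu_x = F.avg_pool2d(x, 3, 1, padding=1)
    mu_y = F.avg_pool2d(y, 3, 1, padding=1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, padding=1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, padding=1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, padding=1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return torch.clamp((1 - num / den) / 2, 0, 1)

def total_loss(left, right, left_recon, right_recon,
               disp_l, disp_r_warped, w_ssim=1.0, w_lr=1.0):
    """Combined unsupervised loss: reconstruction error + structure
    similarity + left-right disparity consistency (weights are assumed)."""
    # Reconstruction error: L1 between the input views and the views
    # re-synthesized from the predicted disparity.
    recon = (torch.mean(torch.abs(left - left_recon)) +
             torch.mean(torch.abs(right - right_recon)))
    # Structure similarity term on the reconstructed views.
    sim = (torch.mean(ssim_loss(left, left_recon)) +
           torch.mean(ssim_loss(right, right_recon)))
    # Left-right consistency: the left disparity should agree with the
    # right disparity warped into the left view.
    lr = torch.mean(torch.abs(disp_l - disp_r_warped))
    return recon + w_ssim * sim + w_lr * lr
```

In this kind of unsupervised setup, the network is trained only on rectified stereo pairs: the disparity prediction is used to warp one view into the other, and the loss above penalizes photometric and structural differences between the warped and original images, plus disagreement between the left and right disparity maps.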
