Abstract

Tracking the instruments in a surgical scene is an essential task in minimally invasive surgery. However, because such scenes are highly unpredictable, automatically segmenting the instruments is very challenging. In this paper, a novel method named the parallel inception network (PaI-Net) is proposed, in which an attention parallel module (APM) and an output fusion module (OFM) are integrated with U-Net to improve its segmentation ability. Specifically, the APM uses multi-scale convolution kernels and global average pooling to extract semantic information and global context information at different scales, while the OFM combines the feature maps of the decoder to aggregate the abundant boundary information of shallow layers with the rich semantic information of deep layers, yielding a significant improvement in the generated segmentation masks. Finally, evaluations of the proposed method on the robotic instrument segmentation task from the Medical Image Computing and Computer Assisted Intervention Society (MICCAI) and the retinal image segmentation task from the International Symposium on Biomedical Imaging (ISBI) show that our model achieves advanced performance on multi-scale semantic segmentation and is superior to current state-of-the-art models.
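The abstract does not specify the exact layer configuration, but a minimal PyTorch sketch of the two modules as described may help fix ideas: parallel convolution branches of different kernel sizes plus a global-average-pooling branch for the APM, and per-stage prediction heads fused at a common resolution for the OFM. Kernel sizes, channel counts, and the summation-based fusion below are assumptions for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionParallelModule(nn.Module):
    """Illustrative sketch of the APM: parallel multi-scale convolution
    branches plus a global-average-pooling (global context) branch.
    Kernel sizes and fusion are assumptions."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Multi-scale convolution branches (kernel sizes are assumed).
        self.branch3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(in_ch, out_ch, kernel_size=7, padding=3)
        # Global context branch: GAP followed by a 1x1 projection.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
        )
        # 1x1 convolution to fuse the concatenated branches.
        self.fuse = nn.Conv2d(4 * out_ch, out_ch, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[2:]
        # Broadcast the global descriptor back to the spatial resolution.
        g = F.interpolate(self.global_branch(x), size=(h, w),
                          mode='bilinear', align_corners=False)
        feats = torch.cat([self.branch3(x), self.branch5(x),
                           self.branch7(x), g], dim=1)
        return self.fuse(feats)

class OutputFusionModule(nn.Module):
    """Illustrative sketch of the OFM: project each decoder feature map
    to class logits, upsample to a shared resolution, and sum, so shallow
    boundary detail and deep semantics both contribute to the mask."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Conv2d(c, num_classes, kernel_size=1) for c in channels)

    def forward(self, decoder_feats):
        # decoder_feats: list of tensors, shallow (high-res) to deep (low-res).
        h, w = decoder_feats[0].shape[2:]
        outs = [F.interpolate(head(f), size=(h, w), mode='bilinear',
                              align_corners=False)
                for head, f in zip(self.heads, decoder_feats)]
        return torch.stack(outs, dim=0).sum(dim=0)
```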
