An attention-guided multi-scale fusion network for surgical instrument segmentation

Mengqiu Song, Chenxu Zhai, Lei Yang, Yanhong Liu, Guibin Bian

doi:10.1016/j.bspc.2024.107296

https://doi.org/10.1016/j.bspc.2024.107296

Abstract

In contemporary surgical practice, minimally invasive surgery has significantly reduced the physiological and psychological strain on patients and shortened their recovery periods. In robot-assisted minimally invasive surgery, precise segmentation of surgical instruments is of great importance: it improves the precision with which surgeons execute surgical maneuvers and strengthens the overall perioperative safety of patients. Accurate instrument segmentation nevertheless remains difficult, mainly because of the complexity of the surgical scene, specular reflection, and the diversity of instruments. To address these challenges, this paper introduces a novel attention-guided multi-scale fusion network. Specifically, to obtain an effective feature representation, a backbone network based on Octave convolution is constructed to reduce feature redundancy. The encoding path also incorporates a Transformer module into the bottleneck layer to inject global contextual information, so that both global and local feature information are captured. Moreover, a dual attention fusion block and a context feature fusion block are integrated into the skip connections to refine local features, preserve edge details, and suppress the interference of irrelevant information. Lastly, this paper presents an adaptive multi-scale feature weighting block, which fuses multi-scale features from different layers of the decoding path. To validate the proposed model, comprehensive experiments are conducted on two widely recognized benchmark datasets. The model reaches a Dice score of 96.34% and an mIoU of 96.14% on the Kvasir-Instrument dataset, and a Dice score of 97.31% and an mIoU of 96.15% on the EndoVis2017 dataset. The experiments show that the proposed network is more accurate and more robust than advanced segmentation models. The proposed model could therefore offer a promising solution for enhancing the precision and safety of robot-assisted minimally invasive surgery.
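
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how the Dice score and the intersection over union (IoU) are conventionally computed for a single pair of binary masks. It is an illustration only, not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made here. The abstract does not state how per-image or per-class values are averaged into the reported mIoU, so that step is left out.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled as instrument in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * intersection / (pred_area + target_area) if (pred_area + target_area) else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou


# Toy 2x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single pair of masks the two quantities are related by IoU = Dice / (2 - Dice), so for that pair the IoU is never larger than the Dice score.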