PWISeg: Weakly-Supervised Surgical Instrument Instance Segmentation
AI-assisted operating room scene understanding is essential for the next generation of surgical interventions, and surgical instrument localization plays an important role in this context. However, existing methods focus primarily on localizing instruments in endoscopy images and struggle with occlusions in broader operating room scenarios. In this work, we propose a weakly supervised instance segmentation framework, Pixel-driven Weakly-supervised Instance Segmentation (PWISeg), to solve occluded instrument localization with low-cost annotations. Specifically, we use the projection relationship between the bounding box and the surgical instrument mask as a supervision signal to train PWISeg to predict coarse instrument masks. We then train PWISeg with annotations of only a few pixel points to predict accurate instrument masks. To validate effectiveness extensively, we collect and release a high-quality dataset, Surg-Inst, which covers real-world hard cases of overlapping instruments, dense placement, and varying levels of occlusion. Experiments demonstrate that PWISeg achieves a clear performance advantage over state-of-the-art methods on both Surg-Inst and the public HOSPI-Tools dataset.
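The two weak supervision signals described above can be sketched as loss functions. The following is a minimal illustration, not the paper's exact formulation: it assumes a BoxInst-style projection loss (the predicted mask's per-row and per-column max-projections should match those of the box-filled mask) and a simple sparse cross-entropy at the few annotated pixel points. The function names `projection_loss` and `point_loss` are hypothetical.

```python
import numpy as np

def projection_loss(pred_mask, box_mask, eps=1e-6):
    """Box-projection supervision (an assumed BoxInst-style loss).

    pred_mask: (H, W) predicted foreground probabilities in [0, 1].
    box_mask:  (H, W) binary mask, 1 inside the ground-truth box.
    Returns a dice loss between the x/y max-projections of the
    prediction and those of the box mask.
    """
    def dice(p, t):
        inter = (p * t).sum()
        return 1.0 - (2.0 * inter + eps) / ((p * p).sum() + (t * t).sum() + eps)

    # Project both masks onto each axis with a max over the other axis:
    # every row/column crossing the box must contain some foreground
    # pixel, and rows/columns outside the box must contain none.
    loss_x = dice(pred_mask.max(axis=0), box_mask.max(axis=0))
    loss_y = dice(pred_mask.max(axis=1), box_mask.max(axis=1))
    return float(loss_x + loss_y)

def point_loss(pred_mask, points, labels, eps=1e-6):
    """Sparse pixel-point supervision (an assumption: binary
    cross-entropy evaluated only at the annotated coordinates).

    points: (N, 2) array of (row, col) annotated pixel locations.
    labels: (N,) array of 0/1 foreground labels for those pixels.
    """
    p = np.clip(pred_mask[points[:, 0], points[:, 1]], eps, 1.0 - eps)
    return float(-(labels * np.log(p) + (1 - labels) * np.log(1 - p)).mean())
```

A prediction that exactly fills the box drives the projection loss to zero, while the point loss lets a handful of labelled pixels sharpen the mask inside the box without dense annotation.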