Abstract

Localizing instrument parts in video-assisted surgeries is an attractive and open computer vision problem. A working algorithm would immediately find applications in computer-aided interventions in the operating theater. Knowing the location of tool parts could help virtually augment the visual faculty of surgeons, assess the skills of novice surgeons, and increase the autonomy of surgical robots. A surgical tool varies in appearance due to articulation, viewpoint changes, and noise. We introduce a new method for detection and pose estimation of multiple non-rigid and robotic tools in surgical videos. The method uses a rigidly structured, bipartite model of end-effector and shaft parts that consistently encodes diverse, pose-specific appearance mixtures of the tool. This rigid part mixtures model then jointly explains the evolving tool structure by switching between mixture components. Rigidly capturing end-effector appearance allows explicit transfer of keypoint meta-data from the detected components for full 2D pose estimation. The detector can also delineate a precise skeleton of the end-effector by transferring additional keypoints. To this end, we propose an effective procedure for learning such rigid mixtures from videos and for pooling the modeled shaft part, which undergoes frequent truncation at the border of the imaged scene. Notably, extensive diagnostic experiments indicate that feature regularization is key to fine-tuning the model in the presence of inherent appearance bias in videos. Experiments further illustrate that end-effector pose estimation improves when the shaft part is included in the model. We then evaluate our approach on publicly available datasets of in-vivo sequences of non-rigid tools and demonstrate state-of-the-art results.
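To make the part-mixtures idea concrete, the following is a minimal, illustrative sketch in Python, not the authors' implementation: each mixture component pairs a pose-specific end-effector template with a shaft template at a fixed relative anchor, detection keeps the best-scoring component and placement, and 2D pose follows by transferring that component's stored keypoint offsets. All names (`filter_response`, `detect`, the component dictionary keys) and the dense HOG-like feature representation are assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: names, dictionary keys, and the dense
# HOG-like feature representation are assumptions, not the paper's API.

def filter_response(feature_map, template):
    """Correlate a (H, W, C) feature map with an (h, w, C) part
    template; returns a 2-D score map over valid placements."""
    H, W, _ = feature_map.shape
    h, w, _ = template.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            scores[y, x] = np.sum(feature_map[y:y + h, x:x + w] * template)
    return scores

def detect(feature_map, components):
    """Score every pose-specific mixture component and keep the best.

    Each component (hypothetical layout) holds:
      'effector', 'shaft' : appearance templates for the two parts,
      'anchor'            : fixed (dy, dx) shaft offset -- the bipartite
                            structure is rigid within a component,
      'keypoints'         : (K, 2) keypoint offsets in the effector frame.
    """
    best = None
    for m, comp in enumerate(components):
        eff = filter_response(feature_map, comp['effector'])
        sha = filter_response(feature_map, comp['shaft'])
        for (y, x), s_eff in np.ndenumerate(eff):
            # Rigid placement: shaft sits exactly at the component's anchor.
            sy, sx = y + comp['anchor'][0], x + comp['anchor'][1]
            if 0 <= sy < sha.shape[0] and 0 <= sx < sha.shape[1]:
                score = s_eff + sha[sy, sx]
                if best is None or score > best[0]:
                    best = (score, m, (y, x))
    score, m, (y, x) = best
    # Pose by meta-data transfer: map the winning component's stored
    # keypoint offsets to the detected end-effector location.
    keypoints = np.array([y, x]) + components[m]['keypoints']
    return score, m, keypoints
```

In this reading, switching between components is what absorbs articulation and viewpoint change; each individual component stays rigid, which is what makes the keypoint transfer in the final step well-defined.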

Highlights

  • Facilitating video-assisted surgery is among the main objectives in developing next-generation operating theaters

  • We demonstrate that a structured part-based model can be successfully applied to detection and pose estimation of surgical instruments

  • We focus on capturing the appearance of surgical tools

Introduction

Facilitating video-assisted surgery is among the main objectives in developing next-generation operating theaters. A surgeon controls surgical instruments either robotically or manually. Invasive surgeries and microsurgeries involve vision sensors that help surgeons correctly position the instruments onto operated tissue areas. Carrying out these surgeries is not easy, though. In minimally invasive surgeries, surgeons insert elongated surgical instruments through keyhole incisions in the body, thereby compromising the dexterity of maneuvers within the body. Delicate retinal microsurgery requires high precision in placing the instruments over the retina after penetrating the eye surface. With the available surgical vision technology, surgeons struggle to perceive depth and lack tactile feedback. Augmenting the surgeon's vision with helpful yet unobtrusive metadata, together with robotized support, appears to be an attractive potential improvement to the existing surgical workflow.
