Abstract

In endoscopy, depth estimation is a task that can help quantify visual information for better scene understanding. A plethora of depth estimation algorithms have been proposed in the computer vision community. The endoscopic domain, however, differs from the typical depth estimation scenario due to differences in the setup and the nature of the scene. Furthermore, it is infeasible to obtain ground truth depth information owing to the unsuitable detection range of off-the-shelf depth sensors and the difficulty of setting up a depth sensor in a surgical environment. In this paper, an existing self-supervised approach from the field of autonomous driving, called Monodepth [1], is applied to a novel dataset of stereo-endoscopic images from reconstructive mitral valve surgery. While it is already known that endoscopic scenes are more challenging than outdoor driving scenes, the paper performs experiments to quantify the comparison and describes the domain gap and the challenges involved in the transfer of these methods.

Highlights

  • The task of depth estimation is a commonly encountered problem in computer vision

  • Sparse depth estimation methods focus on identifying matching feature points, or matching image patches [4]

  • Endoscopic scenes in the case of mitral valve repair are prone to specularities, reflection and occlusion artefacts


Summary

Introduction

The task of depth estimation is a commonly encountered problem in computer vision. Beyond prevalent applications in autonomous driving and robotic navigation, depth estimation also finds use in endoscopy. Depth estimation has been tackled through various approaches in the literature. In supervised approaches, the datasets comprise depth information acquired by depth sensors such as infrared or LiDAR cameras [5], which serves as the ground truth to supervise the learning. Where ground truth information is not available, the supervision comes from motion or binocular parallax, in other words from additional information in the temporal or spatial domain [6]. In endoscopy, the acquisition of ground truth depth information is infeasible due to logistical and safety considerations. Endoscopic scenes in the case of mitral valve repair are prone to specularities, reflections and occlusion artefacts. Occlusions occur when tissue or instruments partially obstruct the endoscopic field of view, and they may persist for a major part of the surgery. The paper examines how existing self-supervised depth estimation approaches address this domain gap in endoscopy, in particular for mitral valve repair.
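
To make the idea of supervision from binocular parallax concrete, the following is a minimal sketch and not the paper's implementation. It assumes a rectified stereo pair, in which a left-image pixel at column x appears in the right image at column x - d, where d is the disparity. Depth then follows from the standard relation Z = f * B / d. A predicted disparity can be scored without ground truth by warping the right image into the left view and measuring the photometric error. The function names, the focal length of 500 px and the baseline of 4 mm are hypothetical values chosen for illustration only.

```python
import math


def disparity_to_depth(disparity_px, focal_length_px, baseline_mm):
    """Depth in mm from disparity in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return math.inf  # zero disparity corresponds to a point at infinity
    return focal_length_px * baseline_mm / disparity_px


def reconstruct_left_row(right_row, disparity_row):
    """Rebuild one left-image row by sampling the right row at x - d.

    Sampling positions are clamped to the image and linearly interpolated.
    """
    width = len(right_row)
    reconstructed = []
    for x, d in enumerate(disparity_row):
        xs = min(max(x - d, 0.0), width - 1.0)
        x0 = int(math.floor(xs))
        x1 = min(x0 + 1, width - 1)
        frac = xs - x0
        reconstructed.append((1.0 - frac) * right_row[x0] + frac * right_row[x1])
    return reconstructed


def photometric_l1(target_row, reconstructed_row):
    """Mean absolute intensity difference, used here as the training signal."""
    total = sum(abs(a - b) for a, b in zip(target_row, reconstructed_row))
    return total / len(target_row)


# Toy example: the right row is the left row shifted by a true disparity of 2 px.
# The last two right-row values are arbitrary because they have no left match.
left_row = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
right_row = [30.0, 40.0, 50.0, 60.0, 0.0, 0.0]

good = reconstruct_left_row(right_row, [2.0] * 6)
bad = reconstruct_left_row(right_row, [0.0] * 6)

# Columns 0 and 1 have no counterpart in the right row, so they are skipped.
print(photometric_l1(left_row[2:], good[2:]))
print(photometric_l1(left_row[2:], bad[2:]))
print(disparity_to_depth(2.0, focal_length_px=500.0, baseline_mm=4.0))
```

The correct disparity gives a lower photometric error than the incorrect one, which is the signal a self-supervised network can be trained on. In endoscopic scenes, specularities and occlusions violate the assumption that matching pixels have the same intensity, which is one reason such a loss is harder to apply there.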

