Abstract
Contemporary endoscopic simultaneous localization and mapping (SLAM) methods accurately compute endoscope poses; however, they only provide a sparse 3-D reconstruction that poorly describes the surgical scene. We propose a novel dense SLAM method whose qualities are: 1) monocular, requiring only the RGB images of a handheld monocular endoscope; 2) fast, providing endoscope positional tracking and 3-D scene reconstruction in parallel threads; 3) dense, yielding an accurate dense reconstruction; 4) robust to the severe illumination changes, poor texture, and small deformations that are typical in endoscopy; and 5) self-contained, needing neither fiducials nor external tracking devices, so it can be smoothly integrated into the surgical workflow. It works as follows. First, the system segments the video into clusters of frames according to parallax criteria, and accurate poses for the cluster frames are estimated using the sparse SLAM feature matches. Next, dense matches between cluster frames are computed in parallel by a variational approach that combines zero-mean normalized cross correlation (ZNCC) and a gradient Huber-norm regularizer. This combination copes with challenging lighting and textures at an affordable time budget on a modern GPU. It can outperform pure stereo reconstructions because the frame cluster can provide larger parallax from the endoscope's motion. We provide an extensive experimental validation on real sequences of the porcine abdominal cavity, both in vivo and ex vivo, and a qualitative evaluation on human liver. In addition, we compare against other dense SLAM methods, showing gains in accuracy, density, and computation time.
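The dense matching step described above can be read as minimizing a variational energy with a ZNCC photometric data term and a Huber-norm regularizer on the gradient of the estimated (inverse) depth map. The following is only a generic sketch of such an energy, not the paper's exact formulation: the weight \lambda, the Huber threshold \varepsilon, the choice of inverse depth \rho, and the warp w_k are illustrative assumptions.

E(\rho) \;=\; \int_{\Omega} \Big[\, \lambda \sum_{k} \big( 1 - \mathrm{ZNCC}\big( I_r(\mathbf{x}),\; I_k\big(w_k(\mathbf{x}, \rho(\mathbf{x}))\big) \big) \big) \;+\; \big\| \nabla \rho(\mathbf{x}) \big\|_{\varepsilon} \,\Big]\, d\mathbf{x}

Here I_r denotes the cluster reference frame, I_k the other frames in the cluster, and w_k a warp that maps a reference pixel \mathbf{x} into frame k given its inverse depth \rho(\mathbf{x}) and the SLAM-estimated poses; the Huber norm \|\cdot\|_{\varepsilon} smooths the reconstruction while preserving depth discontinuities.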