Abstract

One of the greatest challenges in minimally invasive surgery (MIS) is the inadequate visualisation of the surgical field through keyhole incisions. Moreover, occlusions caused by instruments or bleeding can completely obscure anatomical landmarks, reduce surgical vision and lead to iatrogenic injury. This paper proposes an unsupervised, end-to-end deep learning framework, based on fully convolutional neural networks, that reconstructs the view of the surgical scene under occlusions and provides the surgeon with intraoperative see-through vision in these areas. A novel generative, densely connected encoder-decoder architecture is designed that incorporates temporal information through a new type of 3D convolution, the so-called 3D partial convolution, which enhances the learning capabilities of the network and fuses temporal and spatial information. To train the framework, a novel loss function is introduced that combines feature matching, reconstruction, style, temporal and adversarial terms to generate high-fidelity image reconstructions. Advancing the state of the art, our method reconstructs the underlying view obstructed by irregularly shaped occlusions of varying size, location and orientation. The proposed method has been validated on in vivo MIS video data, as well as on natural scenes, across a range of occlusion-to-image ratios (OIR), and compared against the latest video inpainting models in terms of image reconstruction quality using several assessment metrics. The performance evaluation verifies the superiority of the proposed method and its potential clinical value.
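
The 3D partial convolution is only named in the abstract; the PyTorch sketch below shows one plausible realisation, extending the mask-renormalised 2D partial convolution of Liu et al. to the temporal axis. The class name `PartialConv3d`, the renormalisation scheme and the kernel settings are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv3d(nn.Module):
    """Sketch of a 3D partial convolution over video clips.

    Convolves only the valid (non-occluded) voxels in each window,
    renormalises by the fraction of valid voxels, and propagates an
    updated occlusion mask so holes shrink layer by layer.
    """

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size,
                              stride=stride, padding=padding, bias=True)
        # Fixed all-ones kernel used only to count valid voxels per window.
        self.register_buffer(
            "ones", torch.ones(1, 1, kernel_size, kernel_size, kernel_size))
        self.window = kernel_size ** 3
        self.stride, self.padding = stride, padding

    def forward(self, x, mask):
        # x: (N, C, T, H, W) clip; mask: (N, 1, T, H, W), 1 = valid voxel.
        with torch.no_grad():
            valid = F.conv3d(mask, self.ones,
                             stride=self.stride, padding=self.padding)
        out = self.conv(x * mask)
        bias = self.conv.bias.view(1, -1, 1, 1, 1)
        # Renormalise by the share of valid voxels; zero fully occluded windows.
        out = (out - bias) * (self.window / valid.clamp(min=1.0)) + bias
        out = out * (valid > 0)
        # A window becomes valid once it contains any valid voxel.
        return out, (valid > 0).to(mask.dtype)
```

Stacking such layers in a densely connected encoder-decoder would let spatio-temporal context from neighbouring frames fill regions occluded in the current frame, which is the fusion of temporal and spatial information the abstract describes.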

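The combined training objective is likewise described only at a high level. A minimal sketch of how such a weighted sum might be assembled is given below; the individual formulations (Gram-matrix style term, frame-difference temporal term, non-saturating adversarial term) and all `w_*` weights are assumptions rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def gram(feat):
    # Gram matrix per sample, used for the style term.
    n, c, h, w = feat.shape
    f = feat.view(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def total_loss(pred, target, feats_pred, feats_real, d_fake,
               pred_prev, target_prev,
               w_rec=1.0, w_fm=10.0, w_style=120.0, w_tmp=1.0, w_adv=0.05):
    # Reconstruction: pixel-wise L1 on the generated frame.
    l_rec = F.l1_loss(pred, target)
    # Feature matching and style over paired feature maps (e.g. VGG layers).
    l_fm = sum(F.l1_loss(p, r) for p, r in zip(feats_pred, feats_real))
    l_style = sum(F.l1_loss(gram(p), gram(r))
                  for p, r in zip(feats_pred, feats_real))
    # Temporal: penalise flicker between consecutive reconstructed frames.
    l_tmp = F.l1_loss(pred - pred_prev, target - target_prev)
    # Adversarial: non-saturating generator loss on discriminator logits.
    l_adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return (w_rec * l_rec + w_fm * l_fm + w_style * l_style
            + w_tmp * l_tmp + w_adv * l_adv)
```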