Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy

Pedro Esteban Chavarrias Solano, Andrew Bulpitt, Venkataraman Subramanian, Sharib Ali

doi:10.1016/j.media.2024.103379

Abstract

Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and reconstructing it in 3D can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While depth estimation methods in computer vision have been proposed and advanced on natural scene datasets, their efficacy has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing more accurate estimation of camera depth. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate a 15.75% improvement in relative error and a 10.7% improvement in δ1.25 accuracy over the most accurate baseline, the state-of-the-art Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and thus we provide a first benchmark of state-of-the-art methods on this dataset.
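
To make the cross-task consistency idea and the two reported metrics concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one common formulation of depth-normal consistency: surface normals are derived from the predicted depth by finite differences (ignoring camera intrinsics) and compared with the predicted normals via cosine similarity. The abstract does not specify the actual loss, so the function names `normals_from_depth` and `consistency_loss` are hypothetical. The metrics follow the standard definitions of absolute relative error and threshold accuracy, which the paper is assumed to use.

```python
import numpy as np


def normals_from_depth(depth):
    """Approximate unit surface normals from a depth map of shape (H, W).

    Assumption: finite differences with unit pixel spacing; camera
    intrinsics are ignored for brevity.
    """
    dz_dy, dz_dx = np.gradient(depth)  # derivatives along rows, then columns
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    # The z component is 1, so the norm is never zero.
    return normals / np.linalg.norm(normals, axis=-1, keepdims=True)


def consistency_loss(pred_depth, pred_normals):
    """Mean (1 - cosine similarity) between the predicted normals and the
    normals derived from the predicted depth (hypothetical formulation)."""
    derived = normals_from_depth(pred_depth)
    cosine = np.sum(derived * pred_normals, axis=-1)
    return float(np.mean(1.0 - cosine))


def abs_rel_error(pred, gt):
    """Absolute relative error: mean of |pred - gt| / gt."""
    return float(np.mean(np.abs(pred - gt) / gt))


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels where max(pred/gt, gt/pred) is below the threshold."""
    ratio = np.maximum(pred / gt, gt / pred)
    return float(np.mean(ratio < threshold))
```

In a multi-task setup of the kind described, a term like `consistency_loss` would typically be added, with some weighting, to the per-task depth and surface normal losses; the weighting used in the paper is not stated in the abstract.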

Journal: Medical Image Analysis	Publication Date: Jan 1, 2025
License type: cc-by
