Colonoscopy is the gold standard for early diagnosis and pre-emptive treatment of colorectal cancer by detecting and removing colonic polyps. Deep learning approaches to polyp detection have shown potential for enhancing polyp detection rates. However, the majority of these systems are developed and evaluated on static images from colonoscopies, whilst in clinical practice the treatment is performed on a real-time video feed. Non-curated video data remains a challenge, as it contains low-quality frames when compared to still, selected images often obtained from diagnostic records. Nevertheless, it also embeds temporal information that can be exploited to increase predictions stability. A hybrid 2D/3D convolutional neural network architecture for polyp segmentation is presented in this paper. The network is used to improve polyp detection by encompassing spatial and temporal correlation of the predictions while preserving real-time detections. Extensive experiments show that the hybrid method outperforms a 2D baseline. The proposed architecture is validated on videos from 46 patients and on the publicly available SUN polyp database. A higher performance and increased generalisability indicate that real-world clinical implementations of automated polyp detection can benefit from the hybrid algorithm and the inclusion of temporal information.
Read full abstract