Abstract

Colorectal cancer is one of the world's leading causes of death. Fortunately, early diagnosis allows for effective treatment and increases the survival rate. Deep learning techniques have proven useful for increasing the adenoma detection rate at colonoscopy, but they usually require a dataset from which the model can automatically learn the features that characterize polyps. In this work, we present the PICCOLO dataset, which comprises 3433 manually annotated images (2131 white-light images and 1302 narrow-band images) originating from 76 lesions in 40 patients. The images are distributed into training (2203), validation (897) and test (333) sets, with patient independence assured between sets. Clinical metadata are also provided for each lesion. Four different models, obtained by combining two backbones and two encoder–decoder architectures, are trained with the PICCOLO dataset and with two other publicly available datasets for comparison. Results are provided for the test set of each dataset. Models trained with the PICCOLO dataset generalize better: they perform more uniformly across the test sets of all datasets rather than obtaining their best results on their own test set. The dataset is available at the website of the Basque Biobank, and it is expected to contribute to the further development of deep learning methods for polyp detection, localisation and classification, which would eventually result in better and earlier diagnosis of colorectal cancer and hence improved patient outcomes.
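Patient independence between sets means that all images from a given patient fall into exactly one of the training, validation or test sets. The following is a minimal sketch of one way such a split could be produced; it is not the procedure used by the authors. The function name `split_by_patient`, the image-to-patient mapping, and the split fractions are all hypothetical and chosen only for illustration.

```python
import random


def split_by_patient(image_to_patient, fractions=(0.65, 0.25), seed=0):
    """Assign whole patients to train/validation/test.

    image_to_patient: dict mapping an image name to a patient identifier
                      (hypothetical input format).
    fractions:        share of patients for train and validation; the
                      remaining patients go to test (assumed values).
    """
    # Shuffle patients, not images, so one patient never spans two sets.
    patients = sorted(set(image_to_patient.values()))
    random.Random(seed).shuffle(patients)

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))

    # Map each patient to the name of the set it belongs to.
    set_of_patient = {}
    for index, patient in enumerate(patients):
        if index < n_train:
            set_of_patient[patient] = "train"
        elif index < n_train + n_val:
            set_of_patient[patient] = "validation"
        else:
            set_of_patient[patient] = "test"

    # Every image follows its patient into that patient's set.
    splits = {"train": [], "validation": [], "test": []}
    for image, patient in image_to_patient.items():
        splits[set_of_patient[patient]].append(image)
    return splits
```

Because the split is made over patients, the number of images per set depends on how many images each patient contributes, so image-level proportions will generally differ from the patient-level fractions.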

Highlights

  • Colorectal Cancer (CRC) accounts for 10% of all new cancer cases, has a higher incidence rate in developed countries [1], and could be considered a “lifestyle” disease associated with sedentarism and a diet high in calories and animal fat [2]

  • The objective of this paper is to present the PICCOLO dataset with its associated clinical metadata, to compare the performance of different deep learning models trained with it and with other publicly available datasets, and to analyse the influence of polyp morphology on the results

  • On the CVC-EndoSceneStill test set, the best results are split between models trained with that same dataset and models trained with Kvasir-SEG

Summary

Introduction

Colorectal Cancer (CRC) accounts for 10% of all new cancer cases, has a higher incidence rate in developed countries [1], and could be considered a “lifestyle” disease associated with sedentarism and a diet high in calories and animal fat [2]. Detecting CRC at an early stage increases the 5-year survival rate from 18% to 88.5%. A higher adenoma detection rate (ADR) has been shown to be associated with lower interval CRC rates [7], and flat/sessile and small lesions are missed more often than pedunculated/sub-pedunculated and large ones [8,9]. Different approaches can be followed to improve ADR and reduce the number of missed lesions. During colonoscopy, these include endoscope caps, positional manoeuvres, and imaging modalities such as narrow-band imaging (NBI), which emphasizes the capillary pattern and mucosal surface [10]. Further development of Computer Assisted Diagnosis (CAD) systems is clearly justified by their potential to eventually improve patient outcomes [11].
