Abstract

While data-driven approaches excel at many image analysis tasks, the performance of these approaches is often limited by a shortage of annotated data available for training. Recent work in semi-supervised learning has shown that meaningful representations of images can be obtained by training on large quantities of unlabeled data, and that these representations can improve the performance of supervised tasks. Here, we demonstrate that an unsupervised jigsaw learning task, in combination with supervised training, results in up to a 9.8% improvement in correctly classifying lesions in colonoscopy images when compared to a fully supervised baseline. We additionally benchmark improvements in domain adaptation and out-of-distribution detection, and demonstrate that semi-supervised learning outperforms supervised learning in both cases. In colonoscopy applications, these metrics are important given the skill required for endoscopic assessment of lesions, the wide variety of endoscopy systems in use, and the homogeneity that is typical of labeled datasets.

Highlights

  • Colorectal cancer is the second leading cause of cancer death and will cause a predicted 53,200 deaths in the United States in 2020 [1]

  • Dataset: The colonoscopy video data used in this paper was collected at the Johns Hopkins Hospital using a protocol approved by the Johns Hopkins Institutional Review Board (#IRB00184221)

  • We developed a phased training model using a jigsaw-solving task and observed improved performance on metrics including accuracy and F1 score when compared with a purely supervised model

Introduction

Colorectal cancer is the second leading cause of cancer death and will cause a predicted 53,200 deaths in the United States in 2020 [1]. Screening procedures are used to inspect the large intestine and rectum for precancerous lesions so that they may be removed prior to the onset of carcinoma. These lesions come in a variety of geometries and textures, each with an associated risk of progressing to a cancerous state [3]. Multi-modal fusion of pixel-level information, such as color and depth, has been shown to improve classification accuracy [21], [22]. Still, none of these methods utilize the large quantities of available unlabeled data [23].
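
To make the jigsaw pretext task concrete, the following is a minimal sketch of how such a task can be trained on unlabeled frames, assuming PyTorch, a 3x3 tile grid over 96x96 inputs, and a small fixed permutation set. The paper's actual architecture, tile geometry, and permutation set are not given here, so every name and dimension in the sketch is illustrative, not the authors' implementation.

# Minimal sketch of a jigsaw pretext task (assumed framework: PyTorch).
# An unlabeled image is cut into a 3x3 grid of tiles, the tiles are shuffled
# by one of a fixed set of permutations, and a network learns to predict
# which permutation was applied. All sizes below are illustrative assumptions.

import random
import torch
import torch.nn as nn

N_PERMUTATIONS = 24   # hypothetical size of the fixed permutation set
GRID = 3              # 3x3 jigsaw grid
TILE = 32             # tile side in pixels, assuming 96x96 inputs

# Fixed permutation set chosen at random here; published jigsaw methods
# often select permutations that maximize pairwise Hamming distance.
rng = random.Random(0)
PERMUTATIONS = [rng.sample(range(GRID * GRID), GRID * GRID)
                for _ in range(N_PERMUTATIONS)]

def make_jigsaw(image: torch.Tensor):
    """Shuffle a (C, 96, 96) image's 3x3 tiles; return tiles and permutation label."""
    label = rng.randrange(N_PERMUTATIONS)
    perm = PERMUTATIONS[label]
    tiles = [image[:, (i // GRID) * TILE:(i // GRID + 1) * TILE,
                      (i % GRID) * TILE:(i % GRID + 1) * TILE]
             for i in range(GRID * GRID)]
    shuffled = torch.stack([tiles[p] for p in perm])  # (9, C, 32, 32)
    return shuffled, label

class JigsawNet(nn.Module):
    """Shared tile encoder followed by a permutation classifier (sketch only)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())     # -> (B, 32) per tile
        self.head = nn.Linear(32 * GRID * GRID, N_PERMUTATIONS)

    def forward(self, tiles):                          # tiles: (B, 9, C, H, W)
        b, n, c, h, w = tiles.shape
        feats = self.encoder(tiles.reshape(b * n, c, h, w)).reshape(b, -1)
        return self.head(feats)

# One pretext-training step on a batch of (stand-in) unlabeled frames.
model = JigsawNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(8, 3, 96, 96)
batch = [make_jigsaw(img) for img in images]
tiles = torch.stack([t for t, _ in batch])
labels = torch.tensor([l for _, l in batch])
optimizer.zero_grad()
loss = nn.functional.cross_entropy(model(tiles), labels)
loss.backward()
optimizer.step()

In a phased scheme like the one the highlights describe, an encoder pretrained this way on unlabeled video would then be fine-tuned with labeled lesion classes in the supervised phase.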
