Abstract

The frequency of basal cell carcinoma (BCC) cases is putting an increasing strain on dermatopathologists. BCC is the most common type of skin cancer, and its incidence is increasing rapidly worldwide. AI can play a significant role in reducing the time and effort required for BCC diagnostics and thus improve the overall efficiency of the process. However, training such an AI system in a fully supervised fashion would require a large amount of pixel-level annotation by already strained dermatopathologists. Therefore, the primary objective of this study was to develop a weakly supervised approach for the identification of BCC and the stratification of BCC into low-risk and high-risk categories within histopathology whole-slide images (WSIs). We compared Clustering-constrained Attention Multiple instance learning (CLAM) with StreamingCLAM and hypothesized that the latter would be the superior approach. A total of 5147 images were used to train and validate the models, which were subsequently tested on an internal set of 949 images and an external set of 183 images. The labels for training were automatically extracted from free-text pathology reports using a rule-based approach. All data has been made available through the COBRA dataset. The results showed that both the CLAM and StreamingCLAM models achieved high performance for the detection of BCC, with areas under the ROC curve (AUC) of 0.994 and 0.997, respectively, on the internal test set and 0.983 and 0.993 on the external dataset. Furthermore, the models performed well on risk stratification, with AUC values of 0.912 and 0.931, respectively, on the internal set, and 0.851 and 0.883 on the external set. On every metric, the StreamingCLAM model either outperformed the CLAM model or was on par with it. The performance of both models was comparable to that of two pathologists who scored 240 BCC-positive slides.
Additionally, on the public test set, StreamingCLAM achieved a comparably high AUC of 0.958, markedly superior to CLAM’s 0.803. This difference was statistically significant and underscored the strength and better adaptability of the StreamingCLAM approach.
