Abstract

Automated machine learning approaches to skin lesion diagnosis from images are approaching dermatologist-level performance. However, current machine learning approaches that suggest management decisions rely on predicting the underlying skin condition to infer a management decision without considering the variability of management decisions that may exist within a single condition. We present the first work to explore image-based prediction of clinical management decisions directly without explicitly predicting the diagnosis. In particular, we use clinical and dermoscopic images of skin lesions along with patient metadata from the Interactive Atlas of Dermoscopy dataset (1011 cases; 20 disease labels; 3 management decisions) and demonstrate that predicting management labels directly is more accurate than predicting the diagnosis and then inferring the management decision (13.73 ± 3.93% and 6.59 ± 2.86% improvement in overall accuracy and AUROC respectively), statistically significant at p < 0.001. Directly predicting management decisions also considerably reduces the over-excision rate as compared to management decisions inferred from diagnosis predictions (24.56% fewer cases wrongly predicted to be excised). Furthermore, we show that training a model to also simultaneously predict the seven-point criteria and the diagnosis of skin lesions yields an even higher accuracy (improvements of 4.68 ± 1.89% and 2.24 ± 2.04% in overall accuracy and AUROC respectively) of management predictions. Finally, we demonstrate our model’s generalizability by evaluating on the publicly available MClass-D dataset and show that our model agrees with the clinical management recommendations of 157 dermatologists as much as they agree amongst each other.

Highlights

  • … the performance of an artificial intelligence-based automatic skin disease management prediction system

  • The Interactive Atlas of Dermoscopy dataset was used to test the performance of a model trained to predict the clinical management decisions (MGMTpred) compared with inferring the management decisions based on the outputs of a diagnosis prediction model (MGMTinfr)

  • The MClass-D dataset[30] was used to compare the diagnosis and the management prediction performance of our model with that of dermatologists. This dataset contains 100 dermoscopic images comprising 80 benign nevi and 20 melanomas, as well as the responses of 157 dermatologists when asked to make a clinical management decision for each of these 100 images: ‘biopsy/further treatment’ (EXC) or ‘reassure the patient’ (NOEXC)
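
The comparison against dermatologists in the last highlight can be illustrated with a small sketch. It assumes mean pairwise percent agreement as the measure, which is an assumption made here for illustration; the summary does not state which agreement statistic the authors used. The function names and the toy ratings are hypothetical and are not MClass-D data.

```python
import numpy as np


def pairwise_agreement(a, b):
    """Fraction of images on which two raters give the same decision."""
    a, b = np.asarray(a), np.asarray(b)
    return float(np.mean(a == b))


def mean_inter_rater_agreement(ratings):
    """Mean agreement over all distinct pairs of raters.

    ratings: array of shape (n_raters, n_images), 1 = EXC, 0 = NOEXC.
    """
    ratings = np.asarray(ratings)
    n_raters = ratings.shape[0]
    scores = [
        pairwise_agreement(ratings[i], ratings[j])
        for i in range(n_raters)
        for j in range(i + 1, n_raters)
    ]
    return float(np.mean(scores))


def mean_model_agreement(model_preds, ratings):
    """Mean agreement between the model and each individual rater."""
    ratings = np.asarray(ratings)
    return float(np.mean([pairwise_agreement(model_preds, r) for r in ratings]))


# Toy ratings (illustrative only): 3 raters, 4 images.
derm_ratings = np.array([
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
])
model_preds = np.array([1, 0, 0, 1])

inter_rater = mean_inter_rater_agreement(derm_ratings)
model_vs_raters = mean_model_agreement(model_preds, derm_ratings)
```

Under this measure, a model "agrees with dermatologists as much as they agree amongst each other" when `model_vs_raters` is at least as large as `inter_rater`. A chance-corrected statistic such as a kappa coefficient would be a stricter alternative.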

Summary

Introduction

… the performance of an artificial intelligence-based automatic skin disease management prediction system. Such a system can suggest management decisions to a clinician (i.e., as a second opinion) or directly to patients in under-served communities[22]. We evaluate our proposed method on the Interactive Atlas of Dermoscopy Dataset[27,28,29], the largest publicly available database containing both dermoscopy and clinical skin lesion images with the associated management decisions, and show that predicting management decisions directly is more accurate than inferring the management decision from a predicted diagnosis. We validate our model on the publicly available Melanoma Classification Benchmark (MClass-D)[18,30] and show that our model exhibits excellent generalization performance when evaluated on data from a different source, and that our model’s clinical management predictions are in agreement with those made by 157 dermatologists.
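
The difference between the two strategies can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the diagnosis-to-management lookup, the two-label management set, and the probabilities are all hypothetical, chosen only to show why a fixed lookup cannot express different management decisions within one diagnosis.

```python
import numpy as np

# Label sets used for illustration; the lookup table below is hypothetical.
DIAGNOSES = ["BCC", "NEV", "MEL", "MISC", "SK"]
MANAGEMENT = ["EXC", "NOEXC"]

# Hypothetical fixed mapping: every case of a diagnosis gets the same decision.
DIAGNOSIS_TO_MGMT = {
    "BCC": "EXC",
    "NEV": "NOEXC",
    "MEL": "EXC",
    "MISC": "NOEXC",
    "SK": "NOEXC",
}


def infer_management(diag_probs):
    """Inferred strategy: pick the top diagnosis, then look up its management."""
    diagnosis = DIAGNOSES[int(np.argmax(diag_probs))]
    return DIAGNOSIS_TO_MGMT[diagnosis]


def predict_management(mgmt_probs):
    """Direct strategy: pick the top management label from a management head."""
    return MANAGEMENT[int(np.argmax(mgmt_probs))]


# Made-up model outputs for one lesion (illustrative only).
diag_probs = np.array([0.05, 0.40, 0.45, 0.05, 0.05])  # MEL narrowly beats NEV
mgmt_probs = np.array([0.35, 0.65])                    # management head favors NOEXC

mgmt_infr = infer_management(diag_probs)    # follows the lookup for MEL
mgmt_pred = predict_management(mgmt_probs)  # taken directly from the management head
```

In this toy case a borderline diagnosis prediction forces an excision through the lookup, while the directly predicted label does not. The over-excision reduction reported in the abstract concerns errors of this kind.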
