Although significant advancements in computer-aided diagnostics using artificial intelligence (AI) have been made, no viable method for the analysis and classification of radiation-induced skin reactions (RISRs) is available to date. The objective of this single-center study was to develop machine learning and deep learning approaches, the latter based on deep convolutional neural networks (CNNs), for automatic classification of RISRs according to the Common Terminology Criteria for Adverse Events (CTCAE) grading system. Scarletred® Vision, a novel and state-of-the-art digital skin imaging method capable of remote monitoring and objective assessment of acute RISRs, was used to convert 2D digital skin images into the CIELAB color space and to conduct SEV* measurements. A set of machine learning and deep CNN-based algorithms was explored for the automatic classification of RISRs. A total of 2263 distinct images from 209 patients were used to train and test the machine learning and CNN algorithms. For the 2-class problem of healthy skin (grade 0) versus erythema (grade ≥ 1), all machine learning models achieved an accuracy above 70%, and the sensitivity and specificity of erythema recognition were 67–72% and 72–83%, respectively. The CNN achieved a test accuracy of 74%, a sensitivity of 66%, and a specificity of 83% for distinguishing healthy from erythema cases. For severity grade prediction in the 3-class problem (grade 0 versus 1 versus 2), the overall test accuracy of the machine learning models was 60–67%, and the sensitivities were 56–82%, 35–59%, and 65–72% for grades 0, 1, and 2, respectively. For the individual grades, the CNN obtained accuracies of 73%, 66%, and 82%, respectively. Furthermore, we exploited ensemble learning, which combines several individual predictions to improve generalization performance, by deploying a CNN model as a meta-learner. 
The ensemble CNN based on bagging and majority voting achieved an accuracy, sensitivity, and specificity of 87%, 90%, and 82%, respectively, for the 2-class problem. For the 3-class problem, the ensemble CNN achieved an overall accuracy of 66%, while for the individual grades (0, 1, and 2) the accuracies were 76%, 69%, and 87%, the sensitivities were 70%, 57%, and 71%, and the specificities were 78%, 75%, and 95%, respectively. This study is the first to focus on erythema in radiation dermatitis and provides benchmark results using machine learning models. The outcome of this study validates that the proposed system can act as a pre-screening and decision support tool for oncologists or patients, providing fast, reliable, and efficient assessment of erythema grading.
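
To make the reported quantities concrete, the following is a minimal Python sketch of majority voting over several base-model predictions and of per-grade accuracy, sensitivity, and specificity computed one-vs-rest. It is an illustration only and not the authors' pipeline: the function names `majority_vote` and `one_vs_rest_metrics` are hypothetical, the bootstrap resampling step of bagging and the CNN meta-learner are not shown, and the abstract does not state that per-grade metrics were computed exactly this way.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several models by majority vote.

    predictions: list of per-model label lists, all of the same length.
    Ties are resolved in favor of the label that appears first among the
    models' votes for that sample (an arbitrary choice for this sketch).
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def one_vs_rest_metrics(y_true, y_pred, grade):
    """Accuracy, sensitivity, and specificity for one grade versus the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == grade and p == grade for t, p in pairs)
    tn = sum(t != grade and p != grade for t, p in pairs)
    fp = sum(t != grade and p == grade for t, p in pairs)
    fn = sum(t == grade and p != grade for t, p in pairs)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


if __name__ == "__main__":
    # Toy labels (grades 0, 1, 2), invented for illustration; not study data.
    model_a = [0, 1, 2, 1, 0, 2]
    model_b = [0, 1, 1, 1, 0, 2]
    model_c = [1, 0, 2, 1, 0, 0]
    y_true = [0, 1, 2, 2, 0, 2]

    y_vote = majority_vote([model_a, model_b, model_c])
    for g in (0, 1, 2):
        print(g, one_vs_rest_metrics(y_true, y_vote, g))
```

Under this one-vs-rest reading, each grade gets its own accuracy, so the per-grade accuracies can all exceed the overall multi-class accuracy, which is consistent with the pattern of figures reported above.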