Abstract

Introduction
Though widely used, current scar assessment scales are inaccurate and highly subjective, further complicating the already difficult task of determining the optimal management of burn patients. Additional disadvantages of these tools include the need for direct examination by an experienced clinician and the inability to review assessments retrospectively. Common examples of these tools include the Vancouver Scar Scale and the Visual Analog Scale. The lack of an accurate assessment tool inevitably impairs any research examining novel therapeutic strategies designed to improve burn scar outcomes, because it introduces observer bias at every step. New imaging and processing technologies have the potential to bring accuracy, reproducibility, and accessibility to burn scar assessments. With these goals in mind, our team developed a novel scoring system and a classification model based on Machine Learning algorithms and analyzed 87 pictures to obtain scores on Inflammation (I), Scar (S), Uniformity (U), and Pigmentation (P).

Methods
All algorithms were trained using both the sub-acute and the long-term phase pictures. The classification model is based on supervised learning, which requires many examples of annotated pictures and corresponding scar scores. The model used a Linear Discriminant Analysis (LDA) algorithm and visual features of the scars and the natural skin. To train and evaluate this model, four burn care providers individually annotated 186 pictures of skin grafts and later formed a committee to annotate by consensus a subset of representative pictures. The individual annotations served as a human accuracy baseline, while the consensus annotation was treated as the true score and used to train the model.

Results
The model predictions were more accurate for the scores based mainly on color (I and P) than for those based mainly on texture (S and U), as shown by the micro-averaged Area Under the Curve (AUC) of 0.86, 0.61, 0.51, and 0.80 for I, S, U, and P, respectively (Figure 1). The model accuracy was higher than the human baseline for the I (F1 of 0.60 vs. 0.59±0.13, respectively) and P scores (0.54 vs. 0.51±0.09), but lower for the S (0.30 vs. 0.63±0.22) and U scores (0.62 vs. 0.86±0.19).

Conclusions
Our findings are encouraging and suggest that the accuracy of the algorithm could be improved further in the second phase of our assessment development project by increasing the number of pictures it learns from and adding more visual features related to skin texture.

Applicability of Research to Practice
Our study provides an accurate and reproducible evaluation of burn scars that leads to newer therapeutic strategies employed by specialized burn care facilities.
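
As an illustration of the micro-averaged AUC reported in the Results, the following is a minimal sketch of one common way to compute it: each picture's score is binarized one-vs-rest, all (picture, class) pairs are pooled, and a single AUC is computed over the pool. The function names, the number of score levels, and the toy data are hypothetical and are not taken from the study, and the authors' actual evaluation code may differ.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative.

    Tied scores count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_averaged_auc(
    true_classes: Sequence[int],
    class_scores: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Pool every (picture, class) pair one-vs-rest, then compute one AUC."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for true_class, scores in zip(true_classes, class_scores):
        for k in range(n_classes):
            flat_labels.append(1 if k == true_class else 0)
            flat_scores.append(scores[k])
    return binary_auc(flat_labels, flat_scores)


# Hypothetical toy data: 4 pictures, 3 possible score levels (assumed).
# Each row holds the classifier's confidence for levels 0, 1, and 2.
toy_true = [0, 1, 2, 1]
toy_scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]
toy_auc = micro_averaged_auc(toy_true, toy_scores, n_classes=3)
print(f"micro-averaged AUC on toy data: {toy_auc:.3f}")
```

Under this pooling, a value near 0.5 (as reported for U) means the classifier's confidences rank correct score levels no better than chance, while values closer to 1.0 (as for I and P) mean correct levels are usually ranked above incorrect ones.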
