Evaluation of skin ageing is a non-standardized, subjective process, with typical measures relying on coarse, qualitatively defined features. Reflectance confocal microscopy depth stacks contain indicators of both chrono-ageing and photo-ageing. We hypothesize that an ageing scale could be constructed using machine learning and image analysis, creating a data-driven quantification of skin ageing without human assessment. En-face sections of reflectance confocal microscopy depth stacks from the dorsal and volar forearm of 74 participants (36 training, 18 testing, 20 validation) were each represented as a histogram of visual features, with the features learned by unsupervised clustering of small image patches. A logistic regression classifier was trained on these histograms to differentiate between stacks from 20- to 30-year-old and 50- to 70-year-old volunteers. For each stack in the testing set, the probabilistic output of the logistic regression, ranging from 0 to 1, was used as a fine-grained ageing score. Evaluation on the test set was performed in two ways: by computing the AUC for the binary classification problem, and by statistical comparison of the scores between age groups and between body sites. Final validation was performed by assessing the accuracy of the ageing score measurement on 20 depth stacks not used for training or evaluating the classifier. The classifier effectively differentiated stacks from the two age groups, with a test set AUC of 0.908. Mean scores differed significantly between age groups (mean 0.70 vs. 0.44; t=-6.62, p<0.0001) and between stacks from dorsal and volar body sites (mean 0.64 vs. 0.53; t=3.12, p=0.0062). On the final validation set, 17 out of 20 depth stacks were correctly labelled. Despite being limited to coarse training information in the form of example stacks from two age groups, the trained classifier was able to discriminate effectively between younger and older skin.
Curiously, although the classifier was trained only on chronological age, there was still evidence of measurable differences in ageing scores due to sun exposure, with marked differences between the scores of sun-exposed dorsal sites and less sun-exposed volar sites in some volunteers. These results suggest that fine-grained, data-driven quantification of skin ageing is achievable.
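
The pipeline described above (patch clustering, per-stack histograms, logistic regression, probability used as an ageing score, AUC on held-out stacks) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 4-value "patches", the two synthetic textures, the choice of k-means with k=2, the gradient-descent fit, and all function names are assumptions made only for illustration, using the Python standard library.

```python
import math
import random

def nearest(centroids, patch):
    """Index of the centroid closest to a patch (squared Euclidean distance)."""
    dists = [sum((c - p) ** 2 for c, p in zip(cen, patch)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(patches, k, iters=20, seed=0):
    """Plain Lloyd's k-means, standing in for the unsupervised clustering step."""
    init = random.Random(seed)
    centroids = [list(p) for p in init.sample(patches, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in patches:
            groups[nearest(centroids, p)].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def histogram(patches, centroids):
    """Normalised histogram of visual-feature counts for one stack."""
    counts = [0] * len(centroids)
    for p in patches:
        counts[nearest(centroids, p)] += 1
    return [c / len(patches) for c in counts]

def ageing_score(hist, w, b):
    """Logistic-regression probability, read as an ageing score in [0, 1]."""
    z = sum(wi * xi for wi, xi in zip(w, hist)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errs = [ageing_score(x, w, b) - t for x, t in zip(X, y)]
        w = [wi - lr * sum(e * x[j] for e, x in zip(errs, X)) / n
             for j, wi in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b

def auc(scores, labels):
    """AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# --- Synthetic toy data (assumption: NOT real confocal images) ---
rng = random.Random(42)

def toy_stack(n_b, n_patches=50):
    """A 'stack' of 4-value patches: n_b of texture B, the rest of texture A."""
    means = [0.8] * n_b + [0.2] * (n_patches - n_b)
    return [[rng.gauss(m, 0.05) for _ in range(4)] for m in means]

def make_set(n_per_group):
    """Label 0 = 'younger' (little texture B), label 1 = 'older' (mostly B)."""
    stacks = ([toy_stack(10) for _ in range(n_per_group)]
              + [toy_stack(35) for _ in range(n_per_group)])
    return stacks, [0] * n_per_group + [1] * n_per_group

train_stacks, train_labels = make_set(6)
test_stacks, test_labels = make_set(3)

# Learn visual features on training patches only, then fit the classifier.
centroids = kmeans([p for s in train_stacks for p in s], k=2)
w, b = train_logreg([histogram(s, centroids) for s in train_stacks], train_labels)

# Score held-out stacks and summarise with the AUC.
test_scores = [ageing_score(histogram(s, centroids), w, b) for s in test_stacks]
test_auc = auc(test_scores, test_labels)
print("ageing scores:", [round(s, 3) for s in test_scores])
print("test AUC:", test_auc)
```

The sketch learns the cluster centres from training patches only and reuses them unchanged for the held-out stacks, mirroring the separation of training and testing data described above. Real depth stacks would need far more clusters and image-derived patches than this toy setup.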