Background
Although treatments have been proposed for calcinosis cutis (CC) in patients with systemic sclerosis (SSc), a standardized and validated method for quantifying CC burden is necessary to enable valid clinical trials. We tested the hypothesis that computer vision applied to dual-energy computed tomography (DECT) finger images is a useful approach for precise and accurate CC quantification in SSc patients.
Methods
De-identified 2-dimensional (2D) DECT images from SSc patients with clinically evident lesser finger CC lesions were obtained. An expert musculoskeletal radiologist confirmed accurate manual segmentation (subtraction) of the phalanges for each image as a gold standard, and a U-Net Convolutional Neural Network (CNN) computer vision model for segmentation of healthy phalanges was developed and tested. A validation study was then performed in an independent dataset. Two independent radiologists manually measured the longest length and the perpendicular short axis of each lesion and calculated an estimated area by assuming the lesion was elliptical, using the formula (long axis/2) × (short axis/2) × π. A computer scientist used a region growing technique to calculate the area of the CC lesions. Spearman’s correlation coefficient, Lin’s concordance correlation coefficient with 95% confidence intervals (CI), and a Bland-Altman plot (Stata V 15.1, College Station, TX) were used to assess agreement between the radiologists’ and the CNN algorithm-generated area estimates.
Results
Forty de-identified 2D DECT images from SSc patients with clinically evident finger CC lesions were obtained and divided into a training set (N = 30, with image rotation × 3 to expand the set to N = 120) and a test set (N = 10). In the training set, 500 epochs (iterations) were required to train the CNN algorithm to segment phalanges from adjacent CC, and segmentation accuracy was evaluated using the ten held-out images.
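The two area estimates described in the Methods can be illustrated with a minimal sketch. The ellipse function follows the stated formula directly. The region growing function is only one common variant (4-connected growth from a seed pixel, accepting neighbours whose intensity lies within a tolerance of the seed value); the study does not specify its implementation, so the function names, the `tolerance` and `pixel_spacing_mm` parameters, and the toy image are all assumptions made for illustration.

```python
import math
from collections import deque


def ellipse_area_mm2(long_axis_mm, short_axis_mm):
    """Elliptical area estimate: (long axis / 2) x (short axis / 2) x pi."""
    if long_axis_mm <= 0 or short_axis_mm <= 0:
        raise ValueError("axis lengths must be positive")
    return math.pi * (long_axis_mm / 2.0) * (short_axis_mm / 2.0)


def region_grow_area_mm2(image, seed, tolerance, pixel_spacing_mm):
    """Hypothetical region growing sketch on a 2D grid of intensities.

    Starting at `seed` (row, col), 4-connected neighbours are added while
    their intensity is within `tolerance` of the seed intensity. The area is
    the number of accepted pixels times the area of one (square) pixel.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    visited = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the image
            if (nr, nc) in visited:
                continue  # already part of the region
            if abs(image[nr][nc] - seed_value) <= tolerance:
                visited.add((nr, nc))
                queue.append((nr, nc))
    return len(visited) * pixel_spacing_mm ** 2


# Toy example (made-up values, not study data): a 2 x 2 bright lesion.
toy_image = [
    [0, 0, 0, 0],
    [0, 100, 105, 0],
    [0, 98, 102, 0],
    [0, 0, 0, 0],
]
toy_area = region_grow_area_mm2(toy_image, (1, 1), tolerance=10, pixel_spacing_mm=0.5)
```

The ellipse estimate depends only on two caliper measurements, whereas a pixel-counting approach follows the actual lesion outline, which is one reason the two can disagree for irregularly shaped lesions.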
To test model performance, CC lesional area estimates calculated by two independent radiologists and a computer scientist were compared (radiologist 1 vs. radiologist 2 and radiologist 1 vs. computer vision approach) using an independent test dataset comprising 31 images (8 index finger and 23 other fingers). For the radiologist vs. radiologist and radiologist vs. computer vision comparisons, respectively, Spearman’s rho was 0.91 and 0.94 (both p < 0.0001), and Lin’s concordance correlation coefficient was 0.91 (95% CI 0.85–0.98, p < 0.001) and 0.95 (95% CI 0.91–0.99, p < 0.001). Bland-Altman plots demonstrated a mean difference in area estimates of −0.5 mm² (95% limits of agreement −10.0 to 9.0 mm²) for radiologist vs. radiologist and 1.7 mm² (95% limits of agreement −6.0 to 9.5 mm²) for radiologist vs. computer vision.
Conclusions
We demonstrate that CNN-based quantification correlates highly with expert radiologist measurement of finger CC area. Future work will include segmentation of 3-dimensional (3D) images for volumetric and density quantification, as well as validation in larger, independent cohorts.
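For readers unfamiliar with the agreement statistics reported in the Results, the following is a minimal sketch of Lin’s concordance correlation coefficient and the Bland-Altman mean difference with 95% limits of agreement. The study itself used Stata; this standard-library Python version is a swapped-in illustration, the function names are hypothetical, and the sample values are made up rather than taken from the study.

```python
from statistics import mean, stdev


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses the 1/n (population) variance and covariance, as in Lin's original
    definition: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mx, my = mean(x), mean(y)
    s_xx = sum((a - mx) ** 2 for a in x) / n
    s_yy = sum((b - my) ** 2 for b in y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * s_xy / (s_xx + s_yy + (mx - my) ** 2)


def bland_altman(x, y):
    """Return (mean difference, lower limit, upper limit) for x - y.

    The 95% limits of agreement are mean difference +/- 1.96 times the
    sample standard deviation of the paired differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired area estimates in mm^2 (illustrative only).
rater_a = [10.0, 20.0, 30.0]
rater_b = [11.0, 21.0, 31.0]
ccc = lins_ccc(rater_a, rater_b)
bias, lower, upper = bland_altman(rater_a, rater_b)
```

Unlike a rank correlation such as Spearman’s rho, the concordance coefficient is penalised by a systematic offset between raters: in the toy data above the two raters are perfectly correlated, yet the constant 1 mm² offset lowers the coefficient below 1.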