Introduction
Cone-beam computed tomography (CBCT) is widely used to detect jaw lesions, but its interpretation is time-consuming and challenging. Artificial intelligence for CBCT segmentation may improve lesion detection accuracy. However, consistent automated lesion detection remains difficult, especially with limited training data. This study aimed to assess the applicability of pretrained transformer-based architectures for semantic segmentation of CBCT volumes when applied to periapical lesion detection.

Methods
CBCT volumes (n = 138) were collected and annotated by expert clinicians using 5 labels: "lesion," "restorative material," "bone," "tooth structure," and "background." U-Net (convolutional neural network-based) and Swin-UNETR (transformer-based) models, the latter both pretrained (Swin-UNETR-PRETRAIN) and trained from scratch (Swin-UNETR-SCRATCH), were trained with subsets of the annotated CBCTs. These models were then evaluated for semantic segmentation performance using the Sørensen–Dice coefficient (DICE), for lesion detection performance using sensitivity and specificity, and for training sample size requirements by comparing models trained with 20, 40, 60, or 103 samples.

Results
Trained with 103 samples, Swin-UNETR-PRETRAIN achieved a DICE of 0.8512 for "lesion," 0.8282 for "restorative material," 0.9178 for "bone," 0.9029 for "tooth structure," and 0.9901 for "background." "Lesion" DICE did not differ significantly between Swin-UNETR-PRETRAIN trained with 103 and 60 images (P > .05), with the latter achieving 1.00 sensitivity and 0.94 specificity in lesion detection. With small training sets, Swin-UNETR-PRETRAIN outperformed Swin-UNETR-SCRATCH in DICE over all labels (P < .001 [n = 20], P < .001 [n = 40]), and U-Net in lesion detection specificity (P = .006 [n = 20], P = .031 [n = 40]).

Conclusions
Transformer-based Swin-UNETR architectures allowed for excellent semantic segmentation and periapical lesion detection.
Pretrained Swin-UNETR may offer an alternative to classic U-Net architectures when only small training datasets are available.
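
The evaluation metrics named in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' code. The integer label encoding (`LABELS`), the helper names, and the volume-level detection rule (a lesion counts as detected when at least `min_voxels` voxels carry the lesion label) are assumptions made for illustration only; the abstract does not state how detection was scored.

```python
import numpy as np

# Hypothetical integer encoding of the five labels (not given in the abstract).
LABELS = {"background": 0, "bone": 1, "tooth structure": 2,
          "restorative material": 3, "lesion": 4}

def dice(pred, truth, label):
    """Sørensen–Dice coefficient for one label: 2|P ∩ T| / (|P| + |T|)."""
    p = pred == label
    t = truth == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return float("nan")  # label absent from both volumes: Dice is undefined
    return float(2.0 * np.logical_and(p, t).sum() / denom)

def lesion_present(volume, min_voxels=1):
    """Assumed volume-level rule: enough voxels carry the lesion label."""
    return int((volume == LABELS["lesion"]).sum()) >= min_voxels

def sensitivity_specificity(preds, truths):
    """Volume-level sensitivity and specificity over paired label volumes."""
    tp = fp = tn = fn = 0
    for pred, truth in zip(preds, truths):
        p, t = lesion_present(pred), lesion_present(truth)
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

Given a predicted and a reference label volume of the same shape, `dice(pred, truth, LABELS["lesion"])` returns the per-label overlap, and `sensitivity_specificity` summarizes detection over a list of such volume pairs.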