Diabetic peripheral neuropathy (DPN) is common and can go unnoticed until it is well established. This study aims to develop a transformer-based deep learning algorithm (DLA) that classifies corneal confocal microscopy (CCM) images to identify DPN in patients with diabetes. Unlike traditional convolutional neural networks (CNNs), our classification model uses a Swin transformer network with a hierarchical backbone architecture. Participants were classified as having DPN (DPN+, n = 57) or not having DPN (DPN-, n = 37) according to the updated Toronto consensus criteria. The CCM image dataset (570 DPN+ and 370 DPN- images, with five images selected from each of a participant's left and right eyes) was randomly divided into training, validation, and test subsets at a 7:1:2 ratio, with the split made at the participant level. The algorithm was evaluated using sensitivity, specificity, and accuracy, together with Grad-CAM visualization to interpret the model's decisions. At the participant level, the transformer model correctly classified all participants in the DPN+ group of the test subset (n = 12) and misclassified one participant in the DPN- group (n = 7) as DPN+, with an area under the curve (AUC) of 0.9405 (95% CI 0.8166, 1.0000). At the image level, 117 of the DPN+ images (n = 120) and 49 of the DPN- images (n = 70) were correctly classified, with an AUC of 0.8996 (95% CI 0.8502, 0.9491). For single-image predictions, the transformer model achieved a higher AUC than the ResNet50 model (0.8761, 95% CI 0.8155, 0.9366), the Inception_v3 model (0.8802, 95% CI 0.8231, 0.9374), and the DenseNet121 model (0.8965, 95% CI 0.8438, 0.9491). In this study, the transformer-based network outperformed the CNN-based networks in rapid binary DPN classification. Transformer-based DLAs show potential for clinical DPN screening.
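
The sensitivity, specificity, and accuracy implied by the classification counts reported above can be recomputed with a short sketch. The counts are taken from the abstract; the `Confusion` class and the `from_counts` helper are hypothetical names introduced only for illustration, and the resulting figures are derived here rather than quoted from the study. DPN+ is treated as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts, with DPN+ as the positive class."""
    tp: int  # DPN+ correctly classified
    fn: int  # DPN+ classified as DPN-
    tn: int  # DPN- correctly classified
    fp: int  # DPN- classified as DPN+

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


def from_counts(pos_total: int, pos_correct: int,
                neg_total: int, neg_correct: int) -> Confusion:
    """Hypothetical helper: build confusion counts from per-class totals."""
    return Confusion(tp=pos_correct, fn=pos_total - pos_correct,
                     tn=neg_correct, fp=neg_total - neg_correct)


# Counts as reported in the abstract.
participant_level = from_counts(pos_total=12, pos_correct=12,
                                neg_total=7, neg_correct=6)
image_level = from_counts(pos_total=120, pos_correct=117,
                          neg_total=70, neg_correct=49)

for name, c in [("participant", participant_level), ("image", image_level)]:
    print(f"{name}: sensitivity={c.sensitivity:.4f}, "
          f"specificity={c.specificity:.4f}, accuracy={c.accuracy:.4f}")
```

Computed this way, the image-level counts correspond to high sensitivity but noticeably lower specificity, which is worth reading alongside the reported AUC values.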