Machine-learning classification models have the potential to support the diagnosis of many diseases. Pairing such models with synthetic image generation may overcome barriers to their development and permit their use in numerous contexts. Using 10 images of penises with human papillomavirus (HPV)-related disease, we trained a denoising diffusion probabilistic model. By combining this model with text-to-image generation, we produced 630 synthetic images, of which 500 were deemed plausible by expert clinicians. We used those images to train a Vision Transformer model. We assessed the model’s performance on clinical images of HPV-related disease (n = 70), diseases other than HPV-related disease (n = 70), and no disease (n = 70), calculating recall, precision, F1-score, and area under the receiver operating characteristic curve (AUC). The model correctly classified 64 of 70 images of HPV-related disease, a recall of 91.4% (95% CI 82.3%-96.8%). The precision of the model for HPV-related disease was 95.5% (95% CI 87.5%-99.1%), and the F1-score was 93.4%. The AUC for HPV-related disease was 0.99 (95% CI 0.98-1.0). Overall, the HPV-related disease classification model, which was trained exclusively on synthetic images, demonstrated excellent performance on clinical images.
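The reported recall, precision, and F1-score can be checked from raw counts. The following is a minimal sketch, not the authors' analysis code. It assumes 3 false positives for the HPV-related class, a count inferred from the reported precision (64/67 ≈ 95.5%) rather than stated in the text, and it assumes an exact (Clopper-Pearson) binomial interval, since the text does not name the interval method. The function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial interval for k successes in n trials.

    ASSUMPTION: the source does not say which interval method was used.
    """
    def solve(f, target):
        # f is decreasing in p; bisect for f(p) == target on [0, 1].
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

# Counts for the HPV-related class.
tp = 64  # stated: 64 of 70 HPV-related images correctly classified
fn = 6   # follows from 70 - 64
fp = 3   # ASSUMPTION: inferred from the reported precision, not stated

recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)

recall_ci = clopper_pearson(tp, tp + fn)
precision_ci = clopper_pearson(tp, tp + fp)

print(f"recall    = {recall:.1%}, 95% CI {recall_ci[0]:.1%}-{recall_ci[1]:.1%}")
print(f"precision = {precision:.1%}, 95% CI {precision_ci[0]:.1%}-{precision_ci[1]:.1%}")
print(f"F1-score  = {f1:.1%}")
```

Under these assumed counts, the point estimates round to the reported 91.4%, 95.5%, and 93.4%. The AUC is not reproduced here because it requires per-image model scores, which the text does not provide.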