AimTo examine the reliability of ChatGPT in evaluating the quality of medical content of the most watched videos related to urological cancers on YouTube. Material and methodsIn March 2024 a playlist was created of the first 20 videos watched on YouTube for each type of urological cancer. The video texts were evaluated by ChatGPT and by a urology specialist using the DISCERN-5 and Global Quality Scale (GQS) questionnaires. The results obtained were compared using the Kruskal-Wallis test. ResultsFor the prostate, bladder, renal, and testicular cancer videos, the median (IQR) DISCERN-5 scores given by the human evaluator and ChatGPT were (Human: 4 [1], 3 [0], 3 [2], 3 [1], P = .11; ChatGPT: 3 [1.75], 3 [1], 3 [2], 3 [0], P = .4, respectively) and the GQS scores were (Human: 4 [1.75], 3 [0.75], 3.5 [2], 3.5 [1], P = .12; ChatGPT: 4 [1], 3 [0.75], 3 [1], 3.5 [1], P = .1, respectively), with no significant difference determined between the scores. The repeatability of the ChatGPT responses was determined to be similar at 25 % for prostate cancer, 30 % for bladder cancer, 30 % for renal cancer, and 35 % for testicular cancer (P = .92). No statistically significant difference was determined between the median (IQR) DISCERN-5 and GQS scores given by humans and ChatGPT for the content of videos about prostate, bladder, renal, and testicular cancer (P > .05). ConclusionAlthough ChatGPT is successful in evaluating the medical quality of video texts, the results should be evaluated with caution as the repeatability of the results is low.