Objectives
To compare the accuracy and diagnostic concordance of three commercially available AI-based lateral cephalometric tracing software programs.

Materials and methods
Sixty-three lateral cephalometric radiographs were analyzed using a semi-automatic program (Dolphin Imaging Systems LLC) and three AI-based programs (WebCeph™, Cephio, and Ceppro DDH Inc.). Intra- and inter-observer reliability were assessed for the human expert measurements, and repeated-measures one-way ANOVA was used to compare the AI measurements with the human expert measurements. Diagnostic performance was evaluated using sensitivity and specificity tests.

Results
Human expert reliability was excellent (ICC > 0.9) for most cephalometric parameters. Compared with the human experts, all three AI-based programs showed significant differences (WebCeph™: 10 of 11, Cephio: 7 of 11, and Ceppro DDH Inc.: 7 of 11 cephalometric measurements). Variations exceeding two units were noted for most parameters, and the programs differed from the human experts in classifying sagittal and vertical skeletal patterns as well as dental and soft tissue characteristics.

Conclusion
All three AI-based tracing programs showed inaccuracies compared with human expert measurements and lacked reliability in measuring key cephalometric parameters. Clinicians should exercise caution when relying solely on AI-based analyses for orthodontic treatment planning and assessment.
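
As an illustration of the sensitivity and specificity evaluation mentioned in the methods, the following minimal sketch shows how these two values could be computed for a single diagnostic category, taking the human expert classification as the reference. The function name, the class labels, and the example data are hypothetical and are not drawn from the study; the abstract does not describe how the authors carried out the calculation.

```python
import math


def sensitivity_specificity(reference, predicted, positive):
    """Return (sensitivity, specificity) treating `positive` as the target class.

    `reference` holds the human expert classifications and `predicted` the
    classifications from an AI program. Both are assumed to be equal-length
    sequences of labels (a hypothetical data layout).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive:
            if pred == positive:
                tp += 1  # expert and AI both assign the target class
            else:
                fn += 1  # AI misses a case the expert assigned to the class
        else:
            if pred == positive:
                fp += 1  # AI assigns the class where the expert did not
            else:
                tn += 1  # both agree the case is outside the class

    # Return NaN when a rate is undefined (no positive or no negative cases).
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Made-up sagittal skeletal classifications, for illustration only.
expert = ["II", "II", "I", "III", "II", "I"]
ai_program = ["II", "I", "I", "III", "II", "II"]

sens, spec = sensitivity_specificity(expert, ai_program, positive="II")
print(f"Class II sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With more than two diagnostic categories, as in this made-up example, the calculation would be repeated once per category, each time treating that category as positive and all others as negative.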