To evaluate the correlation and agreement between AI and semi-automatic evaluations of calcium scoring CT (CSCT) examinations using extensive data from the Swedish CardioPulmonary bio-Image study (SCAPIS).
In total, 5057 CSCT examinations were performed on one CT system at Linköping University Hospital between October 8, 2015, and June 12, 2018. AI evaluations were compared with the semi-automatic CSCT results from expert reader evaluations performed within SCAPIS. Pearson correlation, intraclass correlation coefficients (ICC), and Bland-Altman analysis were applied to the Agatston (AS), volume (VS), and mass (MS) scores, the number of lesions, and lesion location. Agreement in the classification of Agatston scores into cardiovascular (CV) risk categories was evaluated with weighted kappa analysis.
The evaluation included 4567 subjects, 2229 (48.8%) male and 2338 (51.2%) female, 50-64 years of age (mean 57.3 ± 4.4). The AS ranged from 0 to 2871 in the cohort, with 2846 subjects having an AS of 0. Mean and median AS were 51.4 and 0.0, respectively. The ICCs for total AS, VS, MS, and number of lesions were 0.994, 0.994, 0.994, and 0.960, respectively (p < 0.001). Bland-Altman analyses yielded the following mean differences and limits of agreement (mean difference ± 1.96 SD): AS -0.04 (-52.5 to 52.4), VS -0.44 (-46.51 to 45.63), and MS -0.07 (-9.62 to 9.48). The weighted kappa for CV risk category classification was 0.913, and overall accuracy was 91.2%.
There was excellent correlation and agreement between AI and semi-automatic evaluations for all calcium scores, number of lesions, and lesion location. High degrees of agreement and accuracy were found for the CV risk categorization.
Question Can AI function as a tool for enhancing the efficiency and accuracy of Coronary Artery Calcium Score (CACS) evaluations in clinical radiology practice?
Findings This study confirms the robustness of AI-derived CACS results across extensive datasets, though its generalizability is limited by data acquisition from a single CT system.
Clinical relevance This study suggests that AI holds significant promise as a tool for enhancing the efficiency and accuracy of CACS evaluations, with implications for improving patient diagnostics and reducing radiologist workload in clinical practice.
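The two agreement statistics reported above, Bland-Altman limits of agreement (mean difference ± 1.96 SD) and weighted kappa for risk categories, can be illustrated with a minimal Python sketch. This is not the study's analysis code. The paired scores are made-up illustration values, the risk category cutoffs (0, 1-10, 11-100, 101-400, >400) and the linear weighting scheme are assumptions because the abstract does not state which were used, and all function names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(ai_scores, reader_scores):
    """Return (mean difference, lower limit, upper limit) for paired scores."""
    diffs = [a - r for a, r in zip(ai_scores, reader_scores)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


def risk_category(agatston, cutoffs=(0, 10, 100, 400)):
    """Map an Agatston score to an ordinal category index (0 to 4).

    ASSUMPTION: these cutoffs are for illustration only; the abstract
    does not state which thresholds define the CV risk categories.
    """
    return sum(agatston > c for c in cutoffs)


def weighted_kappa(cats_a, cats_b, n_categories):
    """Cohen's kappa with linear weights.

    ASSUMPTION: linear weighting is chosen for illustration; the abstract
    does not say whether linear or quadratic weights were used.
    """
    n = len(cats_a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(cats_a, cats_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    disagree_obs = disagree_exp = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) / (n_categories - 1)  # linear disagreement weight
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]
    if disagree_exp == 0:
        # Both raters used a single identical category; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return 1.0 - disagree_obs / disagree_exp


# Made-up paired Agatston scores, for illustration only (not study data).
ai = [0, 0, 12, 95, 150, 420]
reader = [0, 0, 9, 101, 148, 431]

bias, lower, upper = bland_altman(ai, reader)
ai_cats = [risk_category(s) for s in ai]
reader_cats = [risk_category(s) for s in reader]
kappa = weighted_kappa(ai_cats, reader_cats, n_categories=5)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
print(f"linear-weighted kappa={kappa:.3f}")
```

Because the limits of agreement are the mean difference plus and minus the same half-width, they are symmetric about the bias, which matches the pattern of the reported values (for example, AS -0.04 with limits -52.5 to 52.4).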