Background
Patients with primary sclerosing cholangitis (PSC) are at risk of developing cholangiocarcinoma (CCA). Establishing predictive models for CCA in PSC is important.

Methods
In a large cohort of 1,459 PSC patients seen at Mayo Clinic (1993–2020), we quantified the impact of clinical/laboratory variables on CCA development using univariate and multivariate Cox models and predicted CCA using statistical and artificial intelligence (AI) approaches. We also explored the power of plasma bile acid (BA) levels to predict CCA in a subset of 300 patients (the BA cohort).

Results
Univariate analysis identified eight significant risk factors (false discovery rate: 20%), of which prolonged inflammatory bowel disease (IBD) was the most important. IBD duration, PSC duration, and total bilirubin remained significant (p < 0.05) in multivariate analysis. Clinical/laboratory variables predicted CCA with cross-validated C-indexes of 0.68–0.71 at different time points of disease, significantly better than commonly used PSC risk scores. Lower chenodeoxycholic acid, a higher conjugated fraction of lithocholic acid and hyodeoxycholic acid, and a higher ratio of cholic acid to chenodeoxycholic acid were predictive of CCA. In the BA cohort, BAs predicted CCA with a cross-validated C-index of 0.66 (std: 0.11), similar to clinical/laboratory variables (C-index = 0.64, std: 0.11). Combining BAs with clinical/laboratory variables yielded the best average C-index of 0.67 (std: 0.13).

Conclusions
In a large PSC cohort, we identified clinical and laboratory risk factors for CCA development and demonstrated the first AI-based predictive models that performed significantly better than commonly used PSC risk scores. More predictive data modalities are needed for clinical adoption of these models.
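
For readers unfamiliar with the C-index reported in the Results, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name `concordance_index`, the simplified handling of tied event times (such pairs are skipped), and the toy data are all assumptions made for illustration only.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time per patient
    events      -- 1 if the event (e.g. CCA) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk went with the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): 4 patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported range of 0.64 to 0.71.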