Abstract

Background and Aims: Exercising clinical judgement is becoming more challenging in the context of ageing populations, increasing prevalence of multimorbidity and frailty, and insufficient evidence on how best to manage increasing complexity in clinical practice. Measures of clinical judgement are surprisingly absent from health information systems despite their clear importance and the potential costs of poor decisions. Suitability for kidney transplantation is a high-stakes clinical judgement, requiring consideration of complex biopsychosocial patient and organ-utilisation factors. This study aimed to test whether it is feasible to assess clinical judgement quantitatively from routine clinical documentation in kidney transplant recipients (KTR).

Method: Participants were identified from a regional register of patients on haemodialysis who underwent deceased donor kidney transplantation and had at least 1 year of follow-up. We identified 15 KTR who had poor outcomes, defined as more than 6 hospital admissions or death within one year of transplantation. Fifteen KTR controls were matched using the United Kingdom Renal Recipient Index (UKRRI: a composite score of age, dialysis requirement at listing, wait time on dialysis and diabetes status). Six blinded raters (two nephrologists, two transplant surgeons and two kidney transplant coordinators) scored clinical letters written at the time of transplant assessment for Clinician Belief of Appropriateness (CBA) across four domains (perceived suitability, perceived benefit, predicted short-term harm and predicted long-term harm), each scored from 1 = worst to 5 = best. The total CBA score was calculated as the mean score across the four domains. Agreement in CBA scores between raters was assessed using the intra-class correlation coefficient (ICC: for two-way agreement). Logistic regression models tested the odds of poor outcome by CBA score in each of the four domains.
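The scoring and agreement calculations described in the Method can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names `total_cba` and `icc_agreement` are hypothetical, the example ratings are made up, and the ICC form is an assumption. The abstract specifies only a two-way agreement ICC, so the sketch uses the single-rater, two-way random-effects, absolute-agreement form (often written ICC(A,1) or ICC(2,1)).

```python
from statistics import mean


def total_cba(domain_scores):
    """Total CBA score: mean of the four domain scores (each 1 = worst to 5 = best)."""
    if len(domain_scores) != 4:
        raise ValueError("expected exactly four domain scores")
    return mean(domain_scores)


def icc_agreement(ratings):
    """Single-rater, two-way random-effects, absolute-agreement ICC.

    ratings: list of rows, one row per rated subject, one column per rater.
    The choice of this ICC form is an assumption; the abstract does not
    state which two-way variant was used.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical example: three raters scoring four letters (values are made up).
example = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 4],
    [1, 2, 1],
]
print(total_cba([3, 4, 2, 5]))
print(round(icc_agreement(example), 2))
```

An absolute-agreement form penalises raters who are systematically harsher or more lenient than their colleagues, which seems the relevant property when the question is whether different clinicians reach the same judgement from the same letter.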
Results: Practising clinicians were eager to participate, suggesting this type of research will be worthwhile and feasible. Among 30 KTR, the median total CBA transplant suitability score was 3 (IQR 3-4, range 1-5). Overall agreement was moderate (ICC for the total CBA score: 0.63, 95% CI 0.51-0.74; p < 0.001), but with substantial variation in CBA scores across the domains and between raters (Figure 1). In 3 of the 4 domains, the point estimates for CBA scores adjusted for UKRRI were in the expected direction, with higher scores corresponding to lower odds of poor outcome after transplantation (perceived suitability: OR 0.91, 95% CI 0.48-1.73; predicted short-term harm: OR 0.70, 95% CI 0.33-1.51; predicted long-term harm: OR 0.82, 95% CI 0.35-1.91). For perceived benefit, the point estimate was not in the expected direction; individuals with greater perceived benefit were more likely to have poor outcomes (OR 1.46, 95% CI 0.28-7.47). Unsurprisingly, given the small numbers in this pilot and the substantial influence of the donor organ on outcomes in kidney transplantation, all the confidence intervals were wide and crossed the null, indicating considerable uncertainty.

Conclusion: In patients undergoing assessment for kidney transplantation, measuring clinical judgement retrospectively is feasible, though subject to variability across domains and between raters. We did not identify any clear associations between retrospective assessment of CBA and outcome after kidney transplantation in this small pilot. In future work, we will develop and validate measures of clinical judgement, collect these measures prospectively, and examine the relationship of measured clinical judgement with health outcomes.
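As a reading aid for the odds ratios reported in the Results, the sketch below checks each interval against the null value of 1 and back-calculates the standard error it implies on the log-odds scale. The odds ratios and intervals are copied from the abstract. The helper names are hypothetical, and the calculation assumes the intervals are Wald intervals (symmetric on the log scale), which the abstract does not state.

```python
import math

# Odds ratios and 95% confidence intervals as reported in the Results
# (CBA score adjusted for UKRRI): (OR, lower bound, upper bound).
reported = {
    "perceived suitability": (0.91, 0.48, 1.73),
    "predicted short-term harm": (0.70, 0.33, 1.51),
    "predicted long-term harm": (0.82, 0.35, 1.91),
    "perceived benefit": (1.46, 0.28, 7.47),
}


def crosses_null(lower, upper, null=1.0):
    """True when the interval contains the null odds ratio of 1."""
    return lower < null < upper


def implied_log_se(lower, upper, z=1.96):
    """Standard error on the log-odds scale, assuming a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def implied_point(lower, upper):
    """Point estimate implied by a log-symmetric interval (geometric mean)."""
    return math.sqrt(lower * upper)


for domain, (odds_ratio, lower, upper) in reported.items():
    print(
        domain,
        "| crosses null:", crosses_null(lower, upper),
        "| implied log-scale SE:", round(implied_log_se(lower, upper), 2),
        "| implied OR:", round(implied_point(lower, upper), 2),
    )
```

Under the Wald assumption, the geometric mean of each pair of bounds lands close to the reported point estimate, and the implied standard errors are large, consistent with the small sample of 30 KTR.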