Background: Primary care data in the UK are widely used for cancer research, but the reliability of recording key events like diagnoses remains uncertain. Although data linkage can improve reliability, its costs, time requirements, and sample size constraints may discourage its use. We evaluated accuracy, completeness, and date concordance of prostate cancer (PCa) diagnosis recording in Clinical Practice Research Datalink (CPRD) GOLD and Aurum compared to linked Cancer Registry (CR) and Hospital Episode Statistics (HES) Admitted Patient Care (APC) in England.

Methods: Incident PCa diagnoses (2000–2016) for males aged ≥46 at diagnosis who remained registered with their General Practitioner (GP) by age 65 and were recorded in at least one data source were analysed. Accuracy was the proportion of diagnoses recorded in GOLD or Aurum with a corresponding record in CR or HES. Completeness was the proportion of CR or HES diagnoses with a corresponding record in GOLD or Aurum.

Results: The final cohorts for comparisons included 29,500 records for GOLD and 26,475 for Aurum. Compared to CR, GOLD was 86 % accurate and 65 % complete, while Aurum was 87 % accurate and 77 % complete. Compared to HES, GOLD was 76 % accurate and 60 % complete, and Aurum was 79 % accurate and 70 % complete. Concordance in diagnosis dates improved over time in both GOLD and Aurum, with 93 % of diagnoses recorded within a year compared to CR, and 66 % (GOLD) and 71 % (Aurum) compared to HES. Delays of 2–3 weeks in primary care diagnosis recording were observed compared to CR, whereas most diagnoses appeared at least 3 months earlier in primary care than in HES.

Conclusions: Aurum demonstrated better accuracy and completeness for PCa diagnosis recording than GOLD. However, linkage to HES or CR is recommended for improved case capture. Researchers should address the limitations of each data source to ensure research validity.
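
The accuracy and completeness definitions given in the Methods can be read as two proportions over the same overlap between sources. The following is a minimal sketch of that reading, not the authors' analysis code: the function names and the toy patient identifiers are hypothetical, and it assumes each diagnosis is represented by a single patient identifier per source.

```python
def accuracy(primary_care_ids, reference_ids):
    """Proportion of primary care diagnoses with a matching reference record."""
    primary, reference = set(primary_care_ids), set(reference_ids)
    if not primary:
        return float("nan")  # undefined when primary care holds no diagnoses
    return len(primary & reference) / len(primary)


def completeness(primary_care_ids, reference_ids):
    """Proportion of reference diagnoses with a matching primary care record."""
    primary, reference = set(primary_care_ids), set(reference_ids)
    if not reference:
        return float("nan")  # undefined when the reference holds no diagnoses
    return len(primary & reference) / len(reference)


# Hypothetical toy data: identifiers of patients with a recorded diagnosis.
primary_care = {1, 2, 3, 4}   # e.g. GOLD or Aurum
reference = {2, 3, 4, 5, 6}   # e.g. CR or HES

print(accuracy(primary_care, reference))      # 3 of 4 matched -> 0.75
print(completeness(primary_care, reference))  # 3 of 5 matched -> 0.6
```

Both measures share the same numerator (diagnoses found in both sources) and differ only in the denominator, which is why a source can be fairly accurate while still missing a substantial share of cases.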