Objectives: To train, test, and externally validate a prediction model that supports general practitioners (GPs) in the early identification of patients at risk of developing symptom diagnoses that persist for more than a year.

Methods: We retrospectively selected all patients with episodes of symptom diagnoses between 2008 and 2021 from the Family Medicine Network (FaMe-Net) database. Within this group, we distinguished symptom diagnoses that lasted less than a year from those that persisted for more than a year. Multivariable logistic regression with backward selection was used to assess which factors were most predictive of symptom diagnoses persisting for more than a year. Model performance was assessed using calibration and discrimination (AUC) measures. The model was externally validated using data from 2018 to 2022 from the AHON registry, a primary care electronic health record registry that includes 73 general practices in the north and east of the Netherlands and about 460,795 patients.

Results: Of the 47,870 patients with a symptom diagnosis included from the FaMe-Net registry, 12,481 (26.1%) had a symptom diagnosis that persisted for more than a year. Older age (≥ 75 years: OR = 1.30, 95% CI [1.19, 1.42]), more previous symptom diagnoses (≥ 3: OR = 1.11, 95% CI [1.05, 1.17]), and more contacts with the GP over the last 2 years (≥ 10 contacts: OR = 5.32, 95% CI [4.80, 5.89]) were predictive of symptom diagnoses persisting for more than a year, with marginally acceptable discrimination (AUC 0.70, 95% CI [0.69, 0.70]). External validation showed poor performance, with an AUC of 0.64 (95% CI [0.63, 0.64]).

Conclusion: A clinical prediction model based on age, number of previous symptom diagnoses, and number of GP contacts might help GPs identify, at an early stage, patients developing symptom diagnoses that persist for more than a year. However, the performance of the original model is limited. Hence, the model is not yet ready for large-scale implementation.
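
As a reading aid for the reported odds ratios, the sketch below shows how they combine for a patient in the highest category of all three predictors. It assumes a logistic model without interaction terms, so that odds ratios multiply (log-odds add). The dictionary keys and the function name are hypothetical labels, not identifiers from the study. The abstract reports neither the intercept nor the reference categories, so only relative odds, not absolute risks, can be derived.

```python
import math

# Adjusted odds ratios as reported in the Results (highest category of each
# predictor). The key names are illustrative labels, not the study's variables.
REPORTED_ODDS_RATIOS = {
    "age_75_or_older": 1.30,
    "three_or_more_previous_symptom_diagnoses": 1.11,
    "ten_or_more_gp_contacts_last_2_years": 5.32,
}


def combined_odds_ratio(present):
    """Combine the odds ratios of the predictors listed in `present`.

    Summing log-odds and exponentiating is the same as multiplying the
    odds ratios. Assumption: the model contains no interaction terms.
    """
    log_odds = sum(math.log(REPORTED_ODDS_RATIOS[name]) for name in present)
    return math.exp(log_odds)


# Share of included patients with a persistent symptom diagnosis,
# recomputed from the reported counts.
prevalence = 12_481 / 47_870

# Odds relative to a patient in the reference category of every predictor.
all_three = combined_odds_ratio(REPORTED_ODDS_RATIOS)

print(f"Prevalence: {prevalence:.1%}")
print(f"Combined odds ratio: {all_three:.2f}")
```

Under these assumptions the combined odds ratio is dominated by the number of GP contacts. The other two predictors shift the odds only modestly.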