Abstract

Background
The prediction and early detection of heart failure (HF) are vital to reduce its substantial impact on quality of life, survival and healthcare expenditure. Current strategies for stratifying the risk of incident HF achieve only modest performance. Moreover, state-of-the-art risk scores focus on commonly acknowledged clinical risk factors; they therefore require the integration of heterogeneous data sources and prioritize individuals with obvious risk constellations.

Purpose
Here, we explored the predictive value of serum metabolomics (168 metabolites detected by proton nuclear magnetic resonance (1H-NMR) spectroscopy) for incident HF.

Methods
Leveraging data from n = 68,311 individuals and > 0.8 million person-years of follow-up from the UK Biobank (UKB) cohort, we (I) evaluated the association of individual metabolites with incident HF by fitting per-metabolite Cox proportional hazards models and (II) trained and validated elastic net (EN) models to predict incident HF from the serum metabolome. Discriminative performance was benchmarked against a comprehensive, well-validated clinical risk score (Pooled Cohort Equations to Prevent HF, PCP-HF). External validation was conducted in the independent FINRISK study.

Results
Several metabolites were independently associated with incident HF (90/168 adjusting for age and sex, 48/168 adjusting for PCP-HF). Performance-optimized risk models retained key predictors representing highly correlated clusters (≈ 80% feature reduction). The addition of metabolomics to PCP-HF improved predictive performance (Harrell's C: 0.768 vs. 0.755, ΔC = 0.013 (95% confidence interval (CI) 0.004 to 0.022), continuous net reclassification improvement (NRI): 0.287 (95% CI 0.200 to 0.367), relative integrated discrimination improvement (IDI): 17.47% (95% CI 9.463 to 27.825)). Simplified models including only age, sex and metabolomics performed almost as well as the PCP-HF model (Harrell's C: 0.745 vs. 0.755, ΔC = 0.010 (95% CI -0.004 to 0.027), continuous NRI: 0.097 (95% CI -0.025 to 0.217), relative IDI: 13.445% (95% CI -10.608 to 41.454)). Integrating metabolomics improved risk and survival stratification. External validation within FINRISK revealed similar patterns using the original model coefficients.

Conclusions
Serum metabolomics improves the prediction of incident HF risk. Scores obtained from the combination of age, sex and metabolomics exhibit predictive power similar to that of clinical risk models. Offering serum metabolomics to a middle-aged cohort might significantly improve HF risk stratification, thus facilitating preventive efforts. Requiring only a single blood draw, serum metabolomics represents a highly cost- and time-effective, standardizable, and scalable alternative to clinical risk scores that involve physical measurements and history taking in addition to clinical chemistry.

Figure panels: Relative performance (test split); ROC & DCA plots (test split)
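
For readers unfamiliar with the headline metric, the following is a minimal sketch of how Harrell's C and the difference ΔC between two risk scores can be computed for right-censored data. It is an illustration only and not the authors' implementation: the function name `harrell_c`, the variable names, and the six-person toy data set are all hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (simplified sketch).

    times:       follow-up time per individual
    events:      1 if incident HF was observed, 0 if censored
    risk_scores: higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event.
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event has the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from UKB or FINRISK).
times = [2, 4, 5, 7, 9, 10]
events = [1, 1, 0, 1, 0, 0]
clinical_score = [0.9, 0.3, 0.5, 0.4, 0.2, 0.1]   # stand-in for a clinical score
combined_score = [0.9, 0.6, 0.5, 0.4, 0.2, 0.1]   # stand-in for clinical + metabolomics

c_clinical = harrell_c(times, events, clinical_score)
c_combined = harrell_c(times, events, combined_score)
delta_c = c_combined - c_clinical
print(f"C clinical = {c_clinical:.3f}, C combined = {c_combined:.3f}, dC = {delta_c:.3f}")
```

In practice, a confidence interval for ΔC such as the ones reported above would typically be obtained by resampling (for example, bootstrapping individuals and recomputing both C values); the abstract does not state which procedure was used.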