Abstract

Background
Machine learning (ML) models may offer advantages over ‘traditional’ regression models in heart failure (HF) prediction.

Objective
To compare the performance of Cox proportional hazards (PH) models and ML survival models in predicting incident HF in men and women without prevalent ischemic heart disease (IHD). We also aimed to identify potential high-risk precursors otherwise ignored by conventional survival models, and to investigate differences between sex-specific models.

Methods
We included 476,393 participants (55.6% women) from the UK Biobank, after excluding participants with a history of HF or IHD, and defined sex-specific datasets. We predicted incident HF events using over 400 baseline characteristics. We constructed multivariable Cox PH models, first including all predictor variables and then only those retained after LASSO stability selection. We also developed two supervised ML models: Random Survival Forest (RSF) and eXtreme Gradient Survival Boosting (XGBoost). We identified the 15 most important sex-specific predictors in each model and compared model performance using the concordance index (C-index). Models were validated using hold-out sets.

Results
During 12.3 ± 1.9 years of follow-up, 4680 (1.76%) women and 6631 (3.14%) men developed incident HF. XGBoost showed the best performance during model training (C-index, training: 0.89 in men, 0.97 in women; validation: 0.77 in men, 0.80 in women). The multivariable Cox model performed second-best (C-index, training: 0.78 in men, 0.82 in women; validation: 0.76 in men, 0.78 in women). RSF performed slightly worse (C-index, training: 0.75 in men, 0.79 in women; validation: 0.75 in men, 0.79 in women) but showed no drop in performance during validation. The Cox model after LASSO stability selection performed similarly to RSF. Age, self-reported lifetime treatments and medications, cystatin-C, waist circumference and FEV1 scores were identified as strong risk factors in all models for both sexes. Reduced albumin levels and elevated HbA1c were more strongly associated with high risk in men, while elevated systolic BP showed higher importance in women. The traditional Cox models identified CRP as important only in men, whereas the ML models identified CRP as important for both sexes. Neutrophil count was considered a strong risk factor in both sexes in the traditional Cox models, yet it was not among the most important predictors in either ML model. Presence of other heart disease (including, among others, pericardial disease, valve disorders and arrhythmias) was an important predictor variable only in the ML models.

Conclusion
ML models showed similar performance to Cox PH models for HF prediction. Despite this, differences in predictor importance were identified between models. Sex-specific risk predictors were found, and FEV1 score, which is not commonly included in existing models, was identified as an important risk factor. These results suggest that ML models may reveal additional insights that would otherwise remain unnoticed.
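
For readers unfamiliar with the C-index used to compare the models, the following is a minimal sketch of Harrell's concordance index in plain Python. The abstract does not state which estimator or software was used, so the function name `concordance_index`, the handling of ties and the toy data are assumptions made for illustration only, not the authors' implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch).

    times       -- observed follow-up time per participant
    events      -- 1 if incident HF was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped, and a pair is only
        # comparable when the participant with the shorter time had an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five participants, two of them censored.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Computing the index separately on the training and hold-out sets gives the kind of training versus validation comparison reported in the Results.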