Abstract Background Determining the etiology of heart failure (HF) is crucial for appropriate treatment. Endomyocardial biopsy (EMB) is an invasive procedure that complements clinical assessment and imaging in diagnosing various cardiac disorders. The diagnostic value of EMB depends on the anticipated yield of the procedure. Whether results from non-invasive examinations can predict the diagnostic yield of EMB is unclear. Purpose We aimed to investigate the relative importance of clinical and instrumental features in predicting a diagnostic EMB result, using a machine learning (ML) approach. Methods We retrospectively examined 665 consecutive adult patients with HF symptoms who underwent right-sided EMB as part of their diagnostic work-up. The primary outcome was a positive biopsy result with a clear diagnosis. The ML algorithm was a gradient boosting machine (GBM), and conditional variable importance was calculated for 53 clinical or instrumental variables. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with that of a logistic regression model using the same variables. Validation was performed using 10-fold cross-validation. Results Among the 665 patients (50±14 years, 67% male), EMB provided a diagnosis in 138 cases (20.8%). Patients with a diagnostic EMB were older, had a shorter disease duration, more often presented with ventricular arrhythmia or cardiac arrest, more often had a pacemaker or defibrillator, displayed higher creatinine, NT-proBNP and high-sensitivity troponin T levels, and had smaller ventricular volumes with higher ejection fraction on cardiac magnetic resonance (CMR). All patients with a diagnostic EMB showed late gadolinium enhancement (LGE) on CMR in the right ventricle, the right atrium or the left atrium. The relative importance of the variables predicting the outcome is presented in Figure 1. 
The top six predictors included four distinct CMR features and two circulating biomarkers (NT-proBNP and high-sensitivity troponin T). LGE in the right ventricle and the presence of left ventricular hypertrophy were stronger predictors than the remaining variables. The GBM model performed significantly better than the logistic regression model (AUC 0.98 for GBM vs. 0.93 for logistic regression; DeLong’s test p<0.001) (Figure 2). Conclusion In an ML analysis, LGE on CMR, left ventricular hypertrophy, and elevated cardiac biomarkers stood out as strong overall predictors of a diagnostic endomyocardial biopsy result.
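
The two evaluation steps named in the Methods, the AUC and 10-fold cross-validation, can be illustrated with a minimal sketch. It is not the study's code: the abstract does not state which software was used, the function names `auc` and `k_fold_indices` are hypothetical, the scores are made up, and the folds here are simply interleaved rather than shuffled or stratified by outcome.

```python
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs with near-equal test folds."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    # Interleaved assignment: index i goes to fold i % k (no shuffling).
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, test))
    return splits


# Made-up scores (not study data): 1 = diagnostic biopsy, 0 = non-diagnostic.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = auc(toy_labels, toy_scores)

# Test-fold sizes for a cohort of 665 patients under 10-fold cross-validation.
fold_sizes = [len(test) for _, test in k_fold_indices(665, k=10)]
```

With 665 patients, each test fold in this sketch holds 66 or 67 patients, and every patient is held out exactly once. In a real analysis the model would be refit on each training split and the held-out predictions pooled or averaged before computing the AUC.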