Abstract

Background
Polypharmacy (PP), which becomes more common with age, is a growing public health challenge that affects health outcomes and increases healthcare expenditure. Leveraging the broad European coverage and longitudinal design of the Survey of Health, Ageing, and Retirement in Europe (SHARE) study, we aim to assess PP prevalence patterns and identify the most effective machine learning (ML) model for predicting long-term PP risk, as a foundation for a scalable predictive tool.

Methods
We used data from participants aged above 50 who were present in wave 6 and in at least one of the three subsequent waves of the SHARE study, aiming to predict PP risk at 2-, 4-, and 6-year intervals. PP was defined as the concurrent use of five or more medications. We analyzed PP prevalence trends and selected the predictor variables from wave 6 using LASSO regression. We evaluated eight ML models, namely ANN, SVM, DT, RF, GB, XGBoost, LightGBM, and CatBoost, using cross-validation to assess the robustness and reliability of their performance.

Results
Our analysis reveals an upward trend in PP prevalence across the surveyed countries, with aggregate figures rising from 34.03% (95% CI 33.1-34.9) in wave 7 to 36.75% (95% CI 35.6-37.9) in wave 8 and 39.91% (95% CI 38.9-40.9) in wave 9. The Categorical Boosting (CatBoost) model predicted PP with overall accuracies of 75.08%, 73.7%, and 71.65% and recall rates of 72.83%, 70.48%, and 67.96% at the 2-, 4-, and 6-year intervals, respectively.

Conclusions
This study reveals a rising trend of PP across European countries and demonstrates the potential of using longitudinal data and ML to enhance PP prediction.
The tool developed represents a step forward in risk stratification. It would be particularly beneficial in practical settings, where family physicians or pharmacists could use it to monitor elderly patients, predict and thus prevent PP, and reduce the associated negative health and economic impact.

Key messages
• The integration of machine learning with longitudinal data presents a significant advance in polypharmacy risk prediction.
• The tool we developed highlights the potential translational impact of our findings.
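
As an illustration of the quantities reported above, the following minimal sketch shows how the PP definition (five or more concurrent medications), a prevalence estimate with a 95% confidence interval, and the accuracy and recall metrics could be computed. It is not the study's code. The function names are hypothetical, the medication counts are made-up toy values rather than SHARE data, and the simple unweighted Wald interval is an assumption; the abstract does not state how its intervals were derived.

```python
import math

# PP threshold taken from the abstract: five or more concurrent medications.
PP_THRESHOLD = 5


def has_polypharmacy(n_medications):
    """Return True if a participant meets the PP definition."""
    return n_medications >= PP_THRESHOLD


def prevalence_with_ci(flags, z=1.96):
    """Return (prevalence, lower, upper) as proportions.

    Assumption: an unweighted Wald (normal approximation) interval.
    """
    n = len(flags)
    p = sum(1 for f in flags if f) / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels, with PP as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, recall


# Toy data (hypothetical): number of medications per participant.
medication_counts = [0, 2, 5, 7, 3, 6, 1, 4, 8, 5]
pp_flags = [has_polypharmacy(c) for c in medication_counts]

prevalence, ci_low, ci_high = prevalence_with_ci(pp_flags)
print(f"PP prevalence: {prevalence:.1%} (95% CI {ci_low:.1%}-{ci_high:.1%})")
```

Recall is the share of participants who actually have PP that a model flags correctly. For a screening-type tool it is therefore informative alongside overall accuracy.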