Abstract

Urban water demand (UWD) forecasting is essential for water supply network optimization and management, both in business-as-usual scenarios and under external climate and socio-economic stressors. Machine learning and deep learning models have shown promising forecasting skill in various areas of application. However, their potential to forecast multi-step-ahead UWD has not been fully explored. Modelling uncertain UWD patterns and accounting for variations in water demand behavior require techniques that can extract time-varying information and multi-scale changes. In this research, we compare state-of-the-art machine learning- and deep learning-based predictive models on 1-day- and 7-day-ahead UWD forecasting, using daily demand data from the city of Milan, Italy. The contribution of this paper is twofold. First, we compare the forecasting performance of different machine learning and deep learning models on single- and multi-step daily UWD forecasting. These models include Artificial Neural Network (ANN), Support Vector Regression (SVR), Light Gradient Boosting Machine (LightGBM), and Long Short-Term Memory network with and without an attention mechanism (AM-LSTM and LSTM, respectively). We benchmark their prediction accuracy against autoregressive time series models. Second, we investigate the potential enhancement in predictive accuracy obtained by incorporating the wavelet transform and feature selection performed by LightGBM into these models. Results show that, overall, wavelet-enhanced feature selection improves the predictive performance of the models. The hybrid model combining wavelet-enhanced feature selection via LightGBM with LSTM (WT-LightGBM-(AM)-LSTM) achieves high accuracy, with Nash–Sutcliffe Efficiency above 0.95 and Kling–Gupta Efficiency above 0.93 for both 1-day- and 7-day-ahead UWD forecasts.
Furthermore, the performance of this hybrid model is shown to be robust to external stressors that cause sudden changes in UWD.
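
For readers unfamiliar with the two skill scores quoted above, the following minimal sketch shows how they are commonly computed. It assumes the standard Nash–Sutcliffe definition and the original three-component formulation of the Kling–Gupta Efficiency (correlation, variability ratio, bias ratio); the study may use a variant of either. The function names `nse` and `kge` and the sample demand values are illustrative and are not taken from the paper.

```python
import math
from statistics import fmean, pstdev


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency (assumed standard form).

    1.0 is a perfect forecast; 0.0 is no better than predicting
    the mean of the observations.
    """
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def kge(observed, simulated):
    """Kling-Gupta Efficiency (assumed original three-component form).

    Combines linear correlation r, variability ratio alpha and bias
    ratio beta; 1.0 is a perfect forecast. Undefined when either
    series is constant or the observed mean is zero.
    """
    mean_obs, mean_sim = fmean(observed), fmean(simulated)
    std_obs, std_sim = pstdev(observed), pstdev(simulated)
    cov = fmean([(o - mean_obs) * (s - mean_sim)
                 for o, s in zip(observed, simulated)])
    r = cov / (std_obs * std_sim)      # Pearson correlation
    alpha = std_sim / std_obs          # variability ratio
    beta = mean_sim / mean_obs         # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Hypothetical daily demand values (arbitrary units), for illustration only.
    demand_observed = [100.0, 120.0, 110.0, 130.0]
    demand_forecast = [102.0, 118.0, 113.0, 127.0]
    print("NSE:", nse(demand_observed, demand_forecast))
    print("KGE:", kge(demand_observed, demand_forecast))
```

Both scores equal 1 for a perfect forecast. NSE penalizes squared errors relative to the variance of the observations, whereas KGE separates the error into correlation, variability, and bias components, which is why the two are often reported together.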