Abstract

Urban drainage models (UDMs) are often used to manage urban flooding. However, these models generally involve many parameters to represent the underlying complex hydrodynamic processes. This makes effective and robust model calibration very challenging, especially given that observations are frequently limited, and leads to unreliable model predictions. This paper makes the first attempt at UDM calibration using the Bayesian‐based Ensemble Smoother (ES) method. Three ES variants are considered: the primary ES, ES with multiple data assimilation (ES‐MDA), and ES with iterative local update (ES‐ILU). Two synthetic cases and one real‐world application with up to 5,236 calibration parameters are tested. The results show that: (a) both ES‐MDA and ES‐ILU can produce effective model calibration, with ES‐ILU outperforming ES‐MDA in terms of both accuracy and uncertainty, while ES exhibits limited performance; (b) for the real‐world case, both the ES‐MDA and ES‐ILU methods provide better calibration results than the best‐known manually obtained solution; (c) a minimum number of observations is required to enable an overall accurate model calibration (e.g., four and ten more monitoring sites are needed in the two synthetic cases); and (d) the model calibrated using an intense rainfall event is generally robust enough to make reliable predictions across different rainfall events, while the model calibrated using a less intense rainfall event does not perform well for more intense rainfall events. It was also found that ubiquitous parameter equifinality significantly hinders unique parameter identification even when overall accurate state estimates are obtained. This should be clearly understood in practical applications.