Summary

Simple rainfall–runoff models used to assess land-use change and to support forest water management are subjected to a selection process that scrutinises their veracity and integrity. Veracity, the ability of a model to provide meaningful information, is assessed using performance criteria that incorporate a popular mean square error (MSE) approach, empirical distribution functions and information criteria. Integrity, a model’s plausibility as reflected in its ability to extract information from data, is assessed using a Bayesian approach. A delayed rejection adaptive Metropolis algorithm is used with a generalised likelihood to calibrate the models. Predictive uncertainty is assessed using a split-sample procedure that uses high-runoff data for calibration and drier data for validation. A simple multiplicative latent variable is used to accommodate input uncertainty in rainfall data, enabling a distinction to be made between the uncertainty associated with data, with parameters and with the models themselves. The study demonstrates the focus provided by setting model evaluation in a philosophical context, the benefits of using a more meaningful range of performance criteria than MSE-based approaches, and the insights into integrity provided by Bayesian analyses. A hyperbolic tangent model is selected as the best of five candidates for its superior veracity and integrity under Australian conditions. Models with extensive application in South Africa, Australia and the USA are rejected. Challenges to applying this approach in water management are identified in the pragmatic nature of the sector, its capacity constraints and a tendency of researchers to place confidence in accepted methods at the expense of rigour.
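The split-sample procedure and the multiplicative rainfall variable described above can be illustrated with a minimal sketch. Several things here are assumptions rather than the study's method: the summary does not state the functional form of the hyperbolic tangent model, so a commonly cited tanh relation, Q = (P - L) - F tanh((P - L)/F), is assumed; calibration uses a plain grid search on MSE in place of the delayed rejection adaptive Metropolis algorithm and generalised likelihood; and all function names, parameter names and data values are hypothetical.

```python
import math

def tanh_runoff(rain, loss, scale):
    """Assumed tanh form: Q = (P - L) - F * tanh((P - L) / F) when P > L, else 0."""
    excess = rain - loss
    if excess <= 0.0:
        return 0.0
    return excess - scale * math.tanh(excess / scale)

def simulate(rainfall, loss, scale, multiplier=1.0):
    """Simulate runoff; `multiplier` is a multiplicative latent variable on rainfall."""
    return [tanh_runoff(multiplier * p, loss, scale) for p in rainfall]

def mse(observed, simulated):
    """Mean square error between observed and simulated runoff."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)

def split_sample(rainfall, runoff, threshold):
    """Split (rain, runoff) pairs into wet (runoff >= threshold) and dry subsets."""
    pairs = list(zip(rainfall, runoff))
    wet = [pr for pr in pairs if pr[1] >= threshold]
    dry = [pr for pr in pairs if pr[1] < threshold]
    return wet, dry

def calibrate(pairs, losses, scales):
    """Grid search (a stand-in for Bayesian calibration) minimising MSE."""
    rain = [p for p, _ in pairs]
    obs = [q for _, q in pairs]
    return min(
        ((loss, scale) for loss in losses for scale in scales),
        key=lambda ls: mse(obs, simulate(rain, ls[0], ls[1])),
    )

# Hypothetical synthetic record generated from the assumed model itself.
rainfall = [400.0, 600.0, 800.0, 1000.0, 1200.0, 1400.0]
runoff = simulate(rainfall, loss=200.0, scale=500.0)

# Calibrate on the high-runoff half, validate on the drier half.
threshold = sorted(runoff)[len(runoff) // 2]
wet, dry = split_sample(rainfall, runoff, threshold)

losses = [float(x) for x in range(0, 401, 50)]
scales = [float(x) for x in range(100, 1001, 100)]
best = calibrate(wet, losses, scales)

validation_mse = mse(
    [q for _, q in dry],
    simulate([p for p, _ in dry], best[0], best[1]),
)
```

Because the synthetic record is noise-free and generated by the same model, the validation error in this sketch says nothing about real predictive uncertainty; it only shows the mechanics of calibrating on wet data and validating on dry data. Setting `multiplier` to a value other than 1 rescales the rainfall input before it enters the model, which is the simplest way to represent input uncertainty separately from the parameters `loss` and `scale`.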