Predictions of soil hydraulic properties by pedotransfer functions (PTFs) must be treated with caution when the PTFs are applied outside the domain in which they were developed and calibrated. However, in some settings, scientists may have little alternative but to use PTFs calibrated elsewhere. In this paper we consider how legacy data can be used to evaluate PTFs in new regions, paying particular attention to the challenges that arise when, as is often the case, the legacy data are not obtained by independent random sampling and may be clustered at multiple scales. We undertook this work in southern Africa (Zimbabwe, Zambia and Malawi), where PTFs have been little used despite the scarcity of direct measurements of the soil properties of interest. We evaluated the extent to which existing PTFs provide a useful tool for the prediction of soil moisture content at field capacity (−33 kPa) and permanent wilting point (−1500 kPa) at different spatial scales. Legacy soil data for the three countries were collated from various sources, and PTFs from temperate and tropical domains were evaluated. We examined the error variance components of the predictions at within-profile, within-site and between-site scales, and estimated the mean errors of the predictions. In general, the better-performing PTFs (with respect to bias and the size of the error variance components) were those calibrated with data from a tropical domain. This was most apparent at −1500 kPa. However, not all PTFs calibrated with data on tropical soils performed well, and predictions from some PTFs calibrated over a temperate domain were better at −33 kPa. The observations were spatially clustered, with data from different depth intervals in the same profile, from profiles in the same experimental site or farm, and from clusters across the region.
This enabled us to show, with an appropriate mixed model analysis, that PTFs which effectively capture regional-scale variation may be less useful for predicting variation within a profile. We propose that such studies, based on legacy data and analysed with a suitable linear mixed model, should be used to screen PTFs of any provenance before their wider application.