Accurate estimates of actual evapotranspiration (ET) are key to a variety of water balance applications, but the wide range of available methodologies can yield divergent results. An ensemble approach is a suitable alternative, as it combines multiple sources through an optimized strategy. In this study, an expert-based multi-step collocation (MC) approach is tested to merge six ET datasets, with the aim of reconstructing a spatiotemporally consistent monthly dataset for Italy over the climatological period 1991–2020. The six input datasets are: three water balance datasets (BIG BANG, LSA SAF, and LISFLOOD), two residual surface energy balance model datasets (SSEBop and ALEXI), and the MODIS standard ET product. The merged product is analyzed for spatiotemporal consistency and evaluated against flux observations from 11 sites. On average, the merged product is more accurate (mean absolute difference = 0.47 ± 0.17 mm/d, relative difference = 27.9 ± 7.5%) than any single base dataset, and it is characterized by limited bias (mean bias error = -0.17 ± 0.26 mm/d), high correlation (r = 0.83 ± 0.10), and more uniform performance across sites. The merged ET dataset is accompanied by an estimate of the ensemble spread, which highlights large differences among ET estimates in some areas and periods characterized by severe water stress, such as southern Italy during the summer. This large spread seems to be driven mostly by systematic differences among datasets, which affect the estimation of the reference climatology, suggesting that inter-model spread could play a defining role in further improving merging strategies.
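
The site-level evaluation metrics quoted above (mean absolute difference, mean bias error, relative difference, and Pearson correlation) could be computed along the following lines. This is only a minimal sketch and not the authors' code: the function name `evaluation_metrics`, the sample values, and in particular the definition of the relative difference (mean absolute difference divided by the mean observed ET) are assumptions made for illustration, since the abstract does not give the formulas.

```python
import math


def evaluation_metrics(est, obs):
    """Compare estimated ET with observed ET (both in mm/d).

    Hypothetical helper for illustration only. The relative difference is
    ASSUMED here to be MAD normalized by the mean observed ET; the abstract
    does not state the definition actually used.
    """
    if len(est) != len(obs) or len(obs) < 2:
        raise ValueError("est and obs must have the same length (>= 2)")

    n = len(obs)
    diffs = [e - o for e, o in zip(est, obs)]

    mad = sum(abs(d) for d in diffs) / n   # mean absolute difference
    mbe = sum(diffs) / n                   # mean bias error (estimate minus observation)

    mean_est = sum(est) / n
    mean_obs = sum(obs) / n
    rd_percent = 100.0 * mad / mean_obs    # assumed normalization

    # Pearson correlation coefficient
    cov = sum((e - mean_est) * (o - mean_obs) for e, o in zip(est, obs))
    var_est = sum((e - mean_est) ** 2 for e in est)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    r = cov / math.sqrt(var_est * var_obs)

    return {"mad": mad, "mbe": mbe, "rd_percent": rd_percent, "r": r}


# Made-up monthly values for a single site (not data from the study)
merged_et = [1.0, 2.0, 3.0, 4.0]
flux_et = [1.5, 2.5, 3.5, 4.5]
metrics = evaluation_metrics(merged_et, flux_et)
```

Under these assumptions, the "±" values in the abstract would correspond to the mean and spread of such per-site metrics across the 11 flux sites; a negative `mbe` indicates that the estimate is lower than the observations on average.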