Abstract

Seasonal-to-multidecadal applications that rely on ocean surface energy fluxes often need surface turbulent fluxes to be accurate to 5 W m−2 or better. While there is little doubt that uncertainties in the flux algorithms and input data can cause considerable errors, the impact of temporal averaging has been more controversial. The biases that result from using monthly averaged winds, temperatures, and humidities in the bulk aerodynamic formula (the so-called classical method) to estimate monthly mean latent heat fluxes are shown to be substantial and to vary spatially in a manner consistent with most prior work. These averaging-related biases are linked to nonnegligible submonthly covariances between the wind, temperature, and humidity. To provide additional insight into the averaging-related bias, the methodology behind the third-generation Florida State University monthly mean surface flux product (FSU3) is detailed to highlight additional sources of error in gridded datasets. The FSU3 latent heat fluxes suffer from this averaging-related bias, which can be as large as 90 W m−2 in western boundary current regions during winter and can exceed 40 W m−2 in synoptically active portions of the tropics. The regional impacts of these biases on the mixed layer temperature tendency are presented to demonstrate that the error resulting from the classical method is physically substantial.
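
The link between the averaging-related bias and the submonthly covariances can be illustrated with a simplified decomposition. The following is only a sketch under stated assumptions, not the formulation used for FSU3: the air density $\rho$, latent heat of vaporization $L_v$, and moisture transfer coefficient $C_E$ are treated as constants (in practice $C_E$ depends on wind speed and stability, which introduces further covariance terms), $w$ is the wind speed, $\Delta q = q_s - q_a$ is the sea-air specific humidity difference, overbars denote monthly means, and primes denote submonthly departures from those means. All of these symbols are introduced here for illustration.

```latex
% Bulk aerodynamic formula for the latent heat flux (simplified, constant coefficients)
E = \rho L_v C_E \, w \, \Delta q

% Decompose each variable into a monthly mean and a submonthly departure
w = \overline{w} + w', \qquad \Delta q = \overline{\Delta q} + \Delta q'

% Monthly mean of the flux (cross terms vanish because \overline{w'} = \overline{\Delta q'} = 0)
\overline{E} = \rho L_v C_E \left( \overline{w}\,\overline{\Delta q} + \overline{w'\,\Delta q'} \right)

% The classical method retains only the product of the monthly means
E_{\mathrm{classical}} = \rho L_v C_E \, \overline{w}\,\overline{\Delta q}

% so, under these assumptions, its bias is set by the submonthly covariance
E_{\mathrm{classical}} - \overline{E} = -\rho L_v C_E \, \overline{w'\,\Delta q'}
```

Under these assumptions, the classical method is unbiased only where wind speed and humidity difference are uncorrelated within the month. Where strong winds coincide with large sea-air humidity differences, the covariance term is positive and the classical estimate is too low.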