Abstract

The main goal of this article is to test the long‐term performance of the three‐dimensional non‐local turbulence (NLT) parameterization scheme at different grid sizes in the so‐called gray zone between classical mesoscale modeling (grid sizes of several km) and large eddy simulations (LES; grid sizes of several 100 m). For this purpose, NLT has been implemented in the numerical weather prediction model ICON (Icosahedral Nonhydrostatic) of Deutscher Wetterdienst (DWD). Results are compared with those of a one‐dimensional version of NLT and of two operational turbulence schemes available in ICON. Comparisons with observations from radiosondes, the operational surface synoptic (SYNOP) station network, and the RAdar‐OnLine‐ANeichung (RADOLAN) radar data of DWD indicate that all turbulence schemes investigated perform reasonably well. Nonetheless, a more detailed study of the model results reveals several differences between the turbulence parameterizations, which are discussed in detail. Median absolute errors (MAEs) from point‐to‐point comparisons between numerical results and SYNOP observations tend to be smaller than those from comparisons in which the simulated fields are averaged over a neighborhood of each station location. This behavior indicates that the averaging process causes a loss of information. For the 2‐m temperature and the hourly precipitation sums, MAEs decrease with decreasing grid size, suggesting an information gain on finer grids. For two of the evaluated variables, the nighttime MAEs obtained with both NLT versions are similar to or lower than those of the operational turbulence schemes of ICON. Moreover, both NLT versions yield a more realistic representation of horizontal structures during a shallow warm‐air intrusion and also during nighttime stable boundary‐layer situations. Radiosonde profiles of the potential temperature confirm that both NLT versions produce reasonable vertical mixing.