Abstract. Within the framework of the AeroCom (Aerosol Comparisons between Observations and Models) initiative, state-of-the-art modelling of aerosol optical properties is assessed using 14 global models participating in the phase III control experiment (AP3). The models are similar to CMIP6/AerChemMIP Earth System Models (ESMs) and provide a robust multi-model ensemble. Inter-model spread of aerosol species lifetimes and emissions appears to be similar to that of mass extinction coefficients (MECs), suggesting that aerosol optical depth (AOD) uncertainties are associated with a broad spectrum of parameterised aerosol processes. Total AOD is approximately the same as in AeroCom phase I (AP1) simulations. However, we find a 50 % decrease in the optical depth (OD) of black carbon (BC), attributable to a combination of decreased emissions and lifetimes. Relative contributions from sea salt (SS) and dust (DU) have shifted from being approximately equal in AP1 to SS contributing about 2/3 of the natural AOD in AP3. This shift is linked with a decrease in DU mass burden, a lower DU MEC, and a slight decrease in DU lifetime, suggesting coarser DU particle sizes in AP3 compared to AP1. Relative to observations, the AP3 ensemble median and most of the participating models underestimate all aerosol optical properties investigated, that is, total AOD as well as fine and coarse AOD (AODf, AODc), Ångström exponent (AE), and dry surface scattering (SCdry) and absorption (ACdry) coefficients. Compared to AERONET, the models underestimate total AOD by ca. 21 % ± 20 % (as inferred from the ensemble median and interquartile range). Against satellite data, the ensemble AOD biases range from −37 % (MODIS-Terra) to −16 % (MERGED-FMI, a multi-satellite AOD product), which we explain by differences between individual satellites and AERONET measurements themselves.
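As an illustration of how a figure such as "21 % ± 20 %" can be read, the following minimal sketch computes a normalised mean bias per model and then the median and interquartile range across models. It rests on assumptions not stated in the text above: the bias is taken to be sum(model − obs) / sum(obs), the spread is taken to be Q3 − Q1, and the function names, model names, and AOD values are all hypothetical.

```python
import numpy as np


def normalised_mean_bias(model, obs):
    """Normalised mean bias in percent over co-located model/observation pairs.

    Assumed definition: 100 * sum(model - obs) / sum(obs).
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Ignore pairs where either value is missing (NaN or inf).
    valid = np.isfinite(model) & np.isfinite(obs)
    return 100.0 * np.sum(model[valid] - obs[valid]) / np.sum(obs[valid])


def ensemble_summary(biases):
    """Median and interquartile range (Q3 - Q1) of per-model biases."""
    q1, median, q3 = np.percentile(biases, [25, 50, 75])
    return median, q3 - q1


# Made-up AOD values at four hypothetical sites (not data from the study).
obs = np.array([0.10, 0.20, 0.40, 0.30])
models = {
    "model_a": obs * 0.9,
    "model_b": obs * 0.8,
    "model_c": obs * 0.7,
}

biases = [normalised_mean_bias(m, obs) for m in models.values()]
median_bias, iqr = ensemble_summary(biases)
print(f"ensemble bias: {median_bias:.0f} % (IQR {iqr:.0f} %)")
```

A negative median indicates that the ensemble underestimates the observations; the interquartile range summarises how far individual models deviate from that median.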
Correlation coefficients (R) between model and observation AOD records are generally high (R>0.75), suggesting that the models are capable of capturing spatio-temporal variations in AOD. We find a much larger underestimate in coarse AODc (∼ −45 % ± 25 %) than in fine AODf (∼ −15 % ± 25 %) with slightly increased inter-model spread compared to total AOD. These results indicate problems in the modelling of DU and SS. The AODc bias is likely due to missing DU over continental land masses (particularly over the United States, SE Asia, and S. America), while marine AERONET sites and the AATSR SU satellite data suggest more moderate oceanic biases in AODc. Column AEs are underestimated by about 10 % ± 16 %. For situations in which measurements show AE > 2, models underestimate AERONET AE by ca. 35 %. In contrast, all models but one exhibit large overestimates in AE when coarse aerosol dominates (bias ca. +140 % if observed AE < 0.5). Simulated AE does not span the observed AE variability. These results indicate that models overestimate particle size (or underestimate the fine-mode fraction) for fine-dominated aerosol and underestimate size (or overestimate the fine-mode fraction) for coarse-dominated aerosol. This must have implications for lifetime, water uptake, scattering enhancement, and the aerosol radiative effect, which we cannot quantify at present. Comparison against Global Atmosphere Watch (GAW) in situ data yields mean biases and inter-model variations of −35 % ± 25 % and −20 % ± 18 % for SCdry and ACdry, respectively. The larger underestimate of SCdry than of ACdry suggests that the models simulate an aerosol single scattering albedo that is too low. The larger underestimate of SCdry than of ambient air AOD is consistent with recent findings that models overestimate scattering enhancement due to hygroscopic growth.
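Two of the arguments above can be made concrete with a short worked example. The sketch below assumes the standard two-wavelength definition of the Ångström exponent, AE = −ln(AOD1/AOD2) / ln(λ1/λ2), and the usual definition of single scattering albedo, SSA = SC / (SC + AC). The wavelength pair (440 and 870 nm), the function names, and all input values are illustrative assumptions; only the −35 % and −20 % biases are taken from the text.

```python
import math


def angstrom_exponent(aod_1, aod_2, wavelength_1, wavelength_2):
    """Two-wavelength Angstrom exponent; larger values mean finer particles."""
    return -math.log(aod_1 / aod_2) / math.log(wavelength_1 / wavelength_2)


def single_scattering_albedo(scattering, absorption):
    """Fraction of extinction (scattering + absorption) due to scattering."""
    return scattering / (scattering + absorption)


# Illustrative AOD pairs at 440 nm and 870 nm (assumed wavelengths).
ae_fine = angstrom_exponent(0.40, 0.16, 440.0, 870.0)    # strong spectral slope
ae_coarse = angstrom_exponent(0.30, 0.27, 440.0, 870.0)  # nearly flat spectrum

# Hypothetical observed dry coefficients in Mm^-1.
sc_obs, ac_obs = 50.0, 5.0
# Apply the mean biases quoted above: -35 % for SCdry, -20 % for ACdry.
sc_mod, ac_mod = sc_obs * (1 - 0.35), ac_obs * (1 - 0.20)

ssa_obs = single_scattering_albedo(sc_obs, ac_obs)
ssa_mod = single_scattering_albedo(sc_mod, ac_mod)
print(f"AE fine-dominated: {ae_fine:.2f}, coarse-dominated: {ae_coarse:.2f}")
print(f"SSA observed: {ssa_obs:.3f}, modelled: {ssa_mod:.3f}")
```

Because scattering is reduced by a larger fraction than absorption, the modelled SSA falls below the observed value for any positive pair of coefficients, which is the reasoning behind the statement that the simulated single scattering albedo is too low.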
The broadly consistent negative bias in AOD and surface scattering suggests an underestimate of aerosol radiative effects in current global aerosol models. Considerable inter-model diversity in the simulated optical properties is often found in regions that are, unfortunately, either not covered or only sparsely covered by ground-based observations. These include, for instance, the Sahara, Amazonia, central Australia, and the South Pacific. This highlights the need for better site coverage in the observations, which would enable a better assessment not only of the models but also of the performance of satellite products in these regions. Using fine-mode AOD as a proxy for present-day aerosol forcing estimates, our results suggest that models underestimate aerosol forcing by ca. 15 %, albeit with a considerably large interquartile range that suggests a spread in the bias between −35 % and +10 %.