Abstract. The AR6 Scenarios Database is a vital repository of climate change mitigation pathways used in the latest Intergovernmental Panel on Climate Change (IPCC) assessment cycle. In its current version, many scenarios in the database lack information about the level of anthropogenic carbon dioxide (CO2) removal via land sinks because net-negative CO2 emissions and gross removals on land are not always separated and are not consistently reported across models. This makes scenario analyses focusing on CO2 removal challenging. We test and compare the performance of different regression models for imputing missing data on land carbon sequestration, at the global level and for several sub-global macro-regions, from available data on net CO2 emissions from agriculture, forestry, and other land uses. We find that a k-nearest neighbors regression performs best among the tested models and use it to produce two publicly available imputation datasets (https://doi.org/10.5281/zenodo.13373539, Prütz et al., 2024) on CO2 removal via land sinks for incomplete global scenarios (n=404) and incomplete regional R10 scenario variants (n=2358) of the AR6 Scenarios Database. We discuss the limitations of our approach, the use of our datasets for secondary assessments of AR6 scenario ensembles, and how this approach compares to other recent AR6 data reanalyses.
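To illustrate the general idea of k-nearest neighbors regression used for imputation, the following is a minimal sketch and not the authors' implementation. It assumes a single predictor (net CO2 emissions from agriculture, forestry, and other land uses), absolute difference as the distance measure, and an unweighted mean over the neighbors. The function name `knn_impute` and all numeric values are hypothetical and are not taken from the AR6 Scenarios Database.

```python
from math import fsum


def knn_impute(known, query_x, k=3):
    """Estimate a missing target as the mean target of the k nearest known points.

    known:   list of (net_afolu_emissions, land_removal) pairs from
             scenarios that report both quantities (hypothetical layout).
    query_x: net AFOLU CO2 emissions of a scenario lacking land removal data.
    """
    if not 1 <= k <= len(known):
        raise ValueError("k must be between 1 and the number of known points")
    # Rank complete scenarios by distance to the query in predictor space.
    nearest = sorted(known, key=lambda pair: abs(pair[0] - query_x))[:k]
    # Unweighted average of the neighbors' reported removals.
    return fsum(y for _, y in nearest) / k


# Hypothetical toy values (Gt CO2 per year), for illustration only.
complete = [(-4.0, 6.0), (-2.0, 4.5), (0.0, 3.0), (1.0, 2.5), (3.0, 1.0)]
estimate = knn_impute(complete, query_x=-1.0, k=2)
print(estimate)
```

In practice, the number of neighbors and the choice of predictors would be selected by comparing out-of-sample performance across candidate models, as the abstract describes for the tested regression models.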