Introduction
Extreme wildfires are increasingly prevalent worldwide, driving significant forest area loss and severe environmental and socioeconomic impacts (Cunningham et al. 2024). The Mediterranean, in particular, is projected to face heightened fire risks due to climate change-induced drier conditions and lower fuel moisture (de Rivera et al. 2020). However, the drivers of extreme wildfires remain poorly understood. Current fire models, typically calibrated on global fire datasets, are primarily designed to estimate annual total burned areas and struggle to capture the unique behaviours of extreme wildfires (Forrest et al. 2024). Furthermore, correlation-based approaches, which dominate current modelling efforts, may fail to identify the underlying causal drivers of these events and are poorly suited for extrapolation to changing conditions.

Causal discovery methods, which aim to identify cause-and-effect relationships from observational data, offer a promising pathway to uncover the mechanisms driving extreme wildfires. While increasingly applied in environmental sciences, their use in wildfire prediction remains limited (de Rivera et al. 2020, Zhang et al. 2024, Zhao et al. 2024). This study will use causal discovery to identify key drivers of extreme wildfire in the Mediterranean, and further integrate the causal graphs into a stand-alone model of wildfire spread. This approach aims to move beyond correlation-based models, improve our understanding of extreme wildfire behaviour and inform more robust mitigation strategies.

Study Area and Data
We will use the Mesogeos dataset (Kondylatos et al. 2023), designed for wildfire modelling in the Mediterranean region. Spanning 17 years (2006–2022) at a 1 km² spatial and daily temporal resolution, it includes meteorological variables (e.g., temperature, wind speed), vegetation indices (e.g., NDVI, LAI), and human activity indicators (e.g., population density, road proximity).
Wildfire data include MODIS fire ignitions and burned areas from EFFIS.

Methods

Extreme Wildfire Definition and Sampling
In this study, we define extreme wildfires as those that are exceptionally large in size. To identify these events, we will first extract the final burned areas associated with each fire ignition recorded in the Mesogeos dataset. Since the classification of large fires is inherently subjective and varies by region, we will adopt a data-driven approach based on a percentile threshold. Specifically, we will define extreme wildfires as those exceeding the 99th percentile of fire sizes, though this threshold may be adjusted to align with extreme fire events documented in national fire reports. While this method provides a straightforward and reproducible way to define extreme events, we acknowledge its limitations. Future work will refine this approach by incorporating region-specific thresholds and additional contextual factors to improve geographic relevance.

Phase I: Causal Discovery
Using local variables from Mesogeos, averaged over final burned areas and lagged to time t, we will estimate causal graphs for extreme events via Python’s Tigramite library with the PCMCI method (Runge et al. 2019). PCMCI detects time-lagged causal associations in large nonlinear datasets through iterative conditional independence testing. To ensure robustness, we will assess graph stability across hyperparameters and selected drivers, and validate graphs through expert knowledge.

Phase II: Causal Fire Spread Model
We will develop a fire spread model incorporating causal mechanisms from Phase I. This model will integrate spatiotemporal fire dynamics, causal dependencies constraining fire spread, and dynamic weather and fuel inputs. By explicitly modelling causal interactions, it aims to improve early warning systems and risk assessments under future climate scenarios.
The causal model’s performance will be benchmarked against statistical models to evaluate its predictive accuracy and robustness.

Expected Results
We expect that the data-driven approach proposed in this study will enhance the predictability of extreme wildfires by reducing confounding effects and capturing key drivers of extreme fire events. Compared to purely statistical approaches, incorporating causal structures should lead to more reliable predictions, particularly in out-of-sample applications or under changing environmental conditions. Furthermore, the causal fire spread model will provide insights into how climate, vegetation, and anthropogenic factors interact to drive fire spread, supporting fire prevention and mitigation strategies.
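
As an illustration of the percentile-based definition of extreme wildfires described in the Methods, the following is a minimal sketch, not the study's actual pipeline. The function name `flag_extreme_fires`, the variable names, and the synthetic log-normal fire sizes are all assumptions made for this example; real inputs would be the final burned areas extracted per ignition from Mesogeos.

```python
import numpy as np


def flag_extreme_fires(burned_areas, percentile=99.0):
    """Return the size threshold and a boolean mask of fires above it.

    burned_areas: one final burned area per fire event (any consistent unit).
    percentile:   cut-off percentile; 99 follows the definition in the text,
                  but it is a tunable choice.
    """
    areas = np.asarray(burned_areas, dtype=float)
    threshold = np.percentile(areas, percentile)
    # Strictly greater than the threshold, i.e. "exceeding" the percentile.
    return threshold, areas > threshold


# Synthetic, heavy-tailed fire sizes (hypothetical data for illustration only).
rng = np.random.default_rng(0)
fire_sizes = rng.lognormal(mean=3.0, sigma=1.5, size=10_000)

threshold, is_extreme = flag_extreme_fires(fire_sizes, percentile=99.0)
extreme_sizes = fire_sizes[is_extreme]
```

Keeping the percentile as a parameter makes it straightforward to test how sensitive the selected sample is to the cut-off, or to apply the same function per region if region-specific thresholds are adopted later.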