Predicting the trajectory of geographical events, such as wildfire spread, is a formidable task because of the dynamic associations among influential biophysical factors. Geo-events like wildfires frequently display both short- and long-range spatial and temporal correlations. Short-range effects arise from direct-contact and near-contact spread of the fire front. Long-range effects arise from processes such as spotting, in which firebrands carried by the wind ignite fires far from the flaming front, altering the shape and speed of the advancing fire front. This study addresses these modeling challenges by defining and analyzing the scale-dependent spatiotemporal dynamics that influence wildfire spread, focusing on the interplay between biophysical factors and fire behavior. We propose two attention-based spatiotemporal models built on Convolutional Long Short-Term Memory (ConvLSTM) networks. These models are designed to learn local-to-global and short- to long-range spatiotemporal correlations. The proposed models were tested on two datasets: a high-resolution wildfire spread dataset produced with a semi-empirical percolation model, and satellite-observed wildfire spread data from California for 2012–2021. Results indicate that the attention-based models predict fire front movements accurately and in a manner consistent with known dynamics between wildfire spread and biophysical factors. Our research suggests that attention mechanisms have considerable potential to capture the spatiotemporal behavior of wildfire spread and that the resulting models are transferable, which could help guide rapid deployment of wildfire management operations. We also highlight directions for future studies on how the self-attention mechanism could enhance model performance across a range of geospatial applications.