We develop a new stochastic spatio-temporal cointegration (SSTC) framework to model the dynamics of spatial and temporal dependence in high-dimensional, non-stationary spatio-temporal data on geological hazards. Such data, often collected by real-time remote sensing, are common in complex geophysical systems such as landslides, earthquakes and volcanic eruptions. Our framework employs cointegrated vector autoregression to characterize the spatio-temporal dependence dynamics with error correction. The framework is justified both statistically and by the domain mechanism underlying the data. By applying the SSTC method to only a small number of empirical dynamic quantile series that summarize the original large-scale data, we achieve computational scalability with insignificant loss of spatio-temporal dynamic information. In this paper, we focus on deriving the SSTC framework and estimating the best SSTC model(s) by the maximum quasi-likelihood principle. We demonstrate its forecasting efficacy by applying the framework to radar data comprising displacement measurements recorded at 1803 locations and 5090 time states over 21.5 days. Broadly, our results provide new insights into modeling, dimension reduction, estimation and prediction in large-scale, non-stationary spatio-temporal data analytics. For landslide forecasting, this study delivers much-needed results for the timely prediction of the dynamics of an impending landslide from big, dense spatio-temporal data.
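
For orientation, a cointegrated vector autoregression with error correction is commonly written in the textbook form sketched below. The notation is assumed for illustration and is not taken from this paper: $y_t$ denotes the $k$-vector of summary series at time $t$, $p$ the lag order, $\alpha$ and $\beta$ the $k \times r$ loading and cointegration matrices with rank $r < k$, $\Gamma_i$ the short-run coefficient matrices, and $\varepsilon_t$ the innovation. The SSTC specification itself may include further terms or restrictions.

```latex
% Generic vector error-correction form of a cointegrated VAR(p).
% All symbols are illustrative assumptions, not the paper's notation.
\begin{equation*}
  \Delta y_t
  = \alpha \beta^{\top} y_{t-1}
  + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i}
  + \varepsilon_t ,
  \qquad \Delta y_t = y_t - y_{t-1} .
\end{equation*}
```

In this form, $\beta^{\top} y_{t-1}$ collects the $r$ stationary long-run relations among the non-stationary series, and $\alpha$ governs how strongly each series adjusts back toward them, which is the error-correction mechanism.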