Accurate forecasting of building cooling load is essential for optimizing control strategies and improving energy efficiency in heating, ventilation, and air conditioning (HVAC) systems. However, existing machine learning and deep learning algorithms struggle with the dynamic and uncertain nature of cooling load demand because they focus solely on temporal analysis. To address this issue, this paper proposes CRG-Informer, a deep learning approach that integrates spatio-temporal features to enhance multi-step predictive performance. First, a time-ordered set of spatial control relationship graphs is constructed from the interconnected structure, static attributes, and control characteristics of the cooling system. Then, a graph neural network (GNN) is designed for graph embedding to extract spatial correlation information at each time step. In addition, the Informer model with the ProbSparse self-attention mechanism is used to capture long-term temporal dependencies among input sequences. Combining these components enables spatio-temporal feature fusion. The proposed approach is evaluated using data from an office building in Hangzhou, China, collected over an entire cooling season. Extensive experiments compare the method with five strong baseline models for 6-, 12-, 24-, 72-, and 144-step cooling load forecasting across four data acquisition intervals. The results demonstrate that CRG-Informer, through spatio-temporal feature fusion, outperforms existing methods and provides a novel solution for improving cooling load forecasting performance.
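As a rough illustration of the graph-embedding step described above, the following minimal sketch turns a sequence of graph snapshots into a sequence of fixed-length vectors that a temporal model such as Informer could then consume. It is not the CRG-Informer architecture: the single round of mean-neighbour aggregation, the mean readout, the function names `embed_snapshot` and `embed_sequence`, and the three-node example graph are all assumptions chosen for brevity.

```python
from typing import List, Tuple

Features = List[List[float]]      # one feature vector per node
Edges = List[Tuple[int, int]]     # directed edges (source, target)


def embed_snapshot(features: Features, edges: Edges) -> List[float]:
    """Embed one graph snapshot (hypothetical stand-in for a GNN layer).

    Each node averages its own features with those of its in-neighbours,
    then the updated node vectors are averaged into one graph-level vector.
    """
    n = len(features)
    dim = len(features[0])
    # Self-loops: every node starts as its own neighbour.
    neighbours = [[i] for i in range(n)]
    for src, dst in edges:
        neighbours[dst].append(src)  # information flows src -> dst
    updated = [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbours
    ]
    # Mean readout over nodes gives the graph embedding.
    return [sum(node[d] for node in updated) / n for d in range(dim)]


def embed_sequence(snapshots: List[Tuple[Features, Edges]]) -> List[List[float]]:
    """Embed every time step; the result is the input to a temporal model."""
    return [embed_snapshot(features, edges) for features, edges in snapshots]


if __name__ == "__main__":
    # Assumed toy system: node 0 -> node 1 -> node 2 (e.g. a control chain).
    edges = [(0, 1), (1, 2)]
    snapshots = [
        ([[0.0, 2.0], [2.0, 4.0], [2.0, 4.0]], edges),
        ([[3.0, 0.0], [3.0, 0.0], [3.0, 0.0]], edges),
    ]
    for step, vector in enumerate(embed_sequence(snapshots)):
        print(step, vector)
```

In a real model the aggregation would use learned weights and the resulting sequence would be passed to the temporal encoder; the sketch only shows how per-step spatial information can be reduced to a sequence of vectors.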