Accurate and intelligent building change detection greatly contributes to effective urban development, optimized resource management, and informed decision-making in domains such as urban planning, land management, and environmental monitoring. Existing methodologies face challenges in effectively integrating local and global features for accurate building change detection. To address these challenges, we propose a novel method that uses focal self-attention to process the feature vector of input images, which uses a “focusing” mechanism to guide the calculation of the self-attention mechanism. By focusing more on critical areas when processing image features in different regions, focal self-attention can better handle both local and global information, and is more flexible and adaptive than other methods, improving detection accuracy. In addition, our multi-level feature fusion module groups the features and then constructs a hierarchical residual structure to fuse the grouped features. On the LEVIR-CD and WHU-CD datasets, our proposed method achieved F1-scores of 91.62% and 89.45%, respectively. Compared with existing methods, ours performed better on building change detection tasks. Our method therefore provides a framework for solving problems related to building change detection, with some reference value and guiding significance.