Efficient transportation planning and management are critical to the smooth operation of rail transit systems, particularly in urban areas with high passenger demand. Real-time prediction of transit flow during underground incidents is vital for ensuring passenger safety, minimizing disruptions, and optimizing resource allocation. This study presents a comprehensive framework for short-term origin–destination (OD) flow prediction, a task that is particularly challenging during incidents. We propose integrating convolutional long short-term memory cells with self-attention cells to learn the long-range spatiotemporal information in historical OD flows. To compensate for the absence of a real-time OD matrix, we introduce an improved graph convolution operation that considers the local connection of ridership to capture vital spatiotemporal interaction patterns. Quantitative and qualitative analyses were conducted to evaluate the performance of the framework and to provide insights into the role of each module. Experimental results suggest that incorporating incident matrix theory improves the accuracy of capturing declining OD flows during incidents by up to approximately 25%. Furthermore, integrating external features such as weather and time helps account for the impact of erratic change factors, especially during off-peak periods. Additionally, using more than one cycle of historical OD flow as input may introduce noise, and OD flows with average daily flows exceeding 30 passengers warrant increased attention. Designed with a high degree of modularity and adaptability, the proposed framework serves as a valid tool for transportation professionals and researchers to address evolving challenges in the management and operation of urban transportation systems.