ABSTRACT Signal-controlled intersections are critical to alleviating urban road congestion. Previous research on congestion prediction has mainly aggregated data at the road-segment level, or aggregated traffic flow over coarse, regular time intervals. Fine-grained prediction of congestion events at the lane level and signal-cycle level enables a detailed understanding of spatio-temporal dependencies, which can help reduce congestion and improve efficiency. This paper presents a Spatio-Temporal Neural Point Process (STNPP) model that combines Graph Neural Networks and Neural Temporal Point Processes to predict congestion events at urban intersections. The proposed model predicts the complete life cycle of a congestion event: its occurrence, development, and dissipation. For spatial correlation modeling, graph neural networks capture the spatial relationships among regions and intersections, with the current intersection and its upstream and downstream areas modeled separately. For temporal correlation modeling at an individual intersection, we focus on a specific lane and capture the evolution of congestion events with the Neural Point Process Gated Recurrent Unit (NPPGRU), which accounts for the varying temporal granularity of signal cycles during congestion events. Using real-world traffic speed and signal control data from Hangzhou, we show that the proposed method achieves stable predictive performance.
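The abstract does not specify the intensity function used by NPPGRU. To make the idea of a neural temporal point process concrete, the following is a minimal Python sketch that assumes an exponential-form conditional intensity, a common choice in recurrent point process models; it is not the authors' formulation. The function names and the scalar `influence` (standing in for a learned projection of a recurrent hidden state) are hypothetical.

```python
import math


def next_event_log_density(influence, w, b, dt):
    """Log-density of the next event arriving dt after the previous one.

    Assumed intensity: lambda(dt) = exp(influence + w * dt + b), where
    `influence` summarizes the event history (e.g. from a GRU hidden state),
    `w` is a non-zero time weight, and `b` is a bias. All are illustrative.
    """
    log_intensity = influence + w * dt + b
    # Closed-form integral of the intensity over [0, dt] (the compensator).
    compensator = (math.exp(log_intensity) - math.exp(influence + b)) / w
    # Point process log-density: log lambda(dt) minus the compensator.
    return log_intensity - compensator


def sequence_log_likelihood(influences, gaps, w, b):
    """Sum the per-event log-densities over a sequence of inter-event gaps."""
    return sum(
        next_event_log_density(v, w, b, dt) for v, dt in zip(influences, gaps)
    )
```

In a sketch like this, training would maximize `sequence_log_likelihood` over observed congestion events; the gaps could be measured in signal cycles rather than seconds, which is one possible reading of the cycle-level granularity described above.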