With increasing urban traffic congestion, studies on reinforcement learning for traffic signal control (RL-TSC), which enables efficient traffic management, have surged. However, most existing RL-TSC research relies on simulation, and real-world deployments remain scarce. Furthermore, existing RL-TSC methods employ a step-based approach, adjusting traffic signals at short, fixed intervals. This approach can be inefficient under oversaturated conditions, where large deviations in traffic demand between movements prevent signals from being managed according to the overall traffic situation. In this study, we aim to transform simulation-based RL-TSC into a practical, field-applicable model. We developed an RL-TSC method capable of responding to oversaturated conditions by designing an action space in which the agent derives an optimal signal set once per cycle, allowing it to account for the traffic situation across all movements. Within each cycle, the proposed model optimizes the traffic signal plan and identifies the optimal signal through iterative exploration. To enable a fast and accurate strategy search, we developed a kinematic wave-based mesoscopic model that estimates density across the entire link from collected traffic data and derives the state and reward values. We validated the field applicability of the proposed RL-TSC method through a real-world demonstration at a congested intersection in Seoul, Korea. The results showed a significant improvement in traffic congestion, with the average queue length at the intersection reduced by up to 11.4%.
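To make the abstract's kinematic wave-based density estimation concrete, the sketch below discretizes a link into cells and propagates densities with a triangular fundamental diagram, in the style of a cell-transmission scheme. All function names and parameter values (free-flow speed, backward wave speed, jam density, capacity) are illustrative assumptions, not the paper's calibration or actual implementation.

```python
# Hypothetical sketch: kinematic wave (LWR) density estimation on a link,
# discretized into cells with a triangular fundamental diagram.
# Parameter values are illustrative assumptions only.

def sending_flow(k, vf, qmax):
    """Demand a cell can send downstream (veh/h), given density k (veh/km)."""
    return min(vf * k, qmax)

def receiving_flow(k, w, kjam, qmax):
    """Supply a cell can accept from upstream (veh/h)."""
    return min(w * (kjam - k), qmax)

def step_density(densities, inflow, vf=50.0, w=20.0, kjam=150.0,
                 qmax=1800.0, dx=0.1, dt=1.0 / 3600.0):
    """Advance per-cell densities (veh/km) by one time step of dt hours.

    densities: list of cell densities along the link, upstream to downstream.
    inflow: demand arriving at the upstream boundary (veh/h).
    """
    n = len(densities)
    # Boundary flows: min of upstream sending demand and downstream supply.
    flows = [min(inflow, receiving_flow(densities[0], w, kjam, qmax))]
    for i in range(1, n):
        s = sending_flow(densities[i - 1], vf, qmax)
        r = receiving_flow(densities[i], w, kjam, qmax)
        flows.append(min(s, r))
    # Downstream boundary: free discharge up to capacity (e.g. green signal).
    flows.append(sending_flow(densities[-1], vf, qmax))
    # Vehicle conservation: dk/dt = (q_in - q_out) / dx
    return [k + (dt / dx) * (flows[i] - flows[i + 1])
            for i, k in enumerate(densities)]
```

In a setup like this, the estimated densities could serve as the agent's state, with a queue- or delay-based quantity derived from them as the reward; a red signal would be modeled by setting the downstream boundary flow to zero instead of free discharge.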