Virtual coupling is a novel railway operational mechanism in which several trains share state information and adjust their behaviour according to control objectives, namely safety assurance (collision avoidance) and stability maintenance (uniform velocity and constant spacing between trains). However, the conflict between the safety and stability requirements becomes more pronounced as the spacing between trains decreases. Moreover, the hybrid nature of train platoon operation, which involves both discrete events and continuous dynamics, poses a significant barrier to control strategy synthesis. This paper presents a dynamic modelling framework and a strategy synthesis approach for Virtually Coupled Train Sets (VCTS) to address these problems. A Hybrid Markov Decision Process (HMDP) is adopted to model the relative longitudinal dynamics of the trains, taking into account the uncertain dynamics of the trains ahead. A Stackelberg game-based strategy synthesis algorithm is applied to guarantee safety in the presence of this uncertainty, and a Q-learning method is used to select a stability strategy from among the safe strategies. The results show a 17.95% expansion of the safe state space compared with the general relative braking scheme (which assumes maximal acceleration during the delay horizon). In a four-train tracking operation scenario, our approach gives stability performance similar to that of Q-learning without safety constraints; compared with Q-learning under general relative braking safety constraints, our approach not only assures safety but also achieves better stability performance.
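
The idea of selecting a stability strategy from within a set of safe strategies can be illustrated with a minimal tabular sketch. It is not the algorithm of this paper: the safe action sets are assumed to be given in advance (here written by hand, rather than synthesized by a Stackelberg game over an HMDP), the states and actions are abstract integers, and the names `safe_q_update`, `select_action`, `q`, and `safe` are hypothetical.

```python
import random


def select_action(q, safe, s, epsilon=0.1, rng=random):
    """Epsilon-greedy choice restricted to the actions marked safe in state s.

    q    : dict mapping state -> list of action values (hypothetical layout)
    safe : dict mapping state -> set of safe action indices (assumed given)
    """
    allowed = sorted(safe[s])
    if not allowed:
        raise ValueError(f"no safe action available in state {s!r}")
    if rng.random() < epsilon:
        return rng.choice(allowed)                  # explore, but only among safe actions
    return max(allowed, key=lambda b: q[s][b])      # exploit the best safe action


def safe_q_update(q, safe, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step whose bootstrap maximum ranges over safe actions only."""
    # If the next state has no safe action, treat its value as 0 (an assumption).
    best_next = max((q[s_next][b] for b in safe[s_next]), default=0.0)
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])
    return q[s][a]


if __name__ == "__main__":
    # Toy example: two states, three actions; action 1 is unsafe in state 1.
    q = {0: [0.0, 0.0, 0.0], 1: [1.0, 5.0, 2.0]}
    safe = {0: {0, 1}, 1: {0, 2}}
    new_value = safe_q_update(q, safe, s=0, a=1, r=1.0, s_next=1, alpha=0.5)
    print(new_value)
    print(select_action(q, safe, s=1, epsilon=0.0))
```

In the toy example, the highest-valued action in state 1 is excluded because it is not in the safe set, so both the greedy choice and the bootstrap target use only the remaining safe actions. Restricting the maximum in this way is one possible design choice; it keeps the learned values consistent with a policy that never leaves the safe set.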