Drivers who frequently engage in high-risk events (e.g., hard braking maneuvers) pose a significant threat to traffic safety. Existing studies have commonly estimated the occurrence probability of high-risk events under the assumption that data collected in different time periods are independent and identically distributed (the i.i.d. assumption). Such approaches ignore temporal covariate shift in driving behavior, whereby the distributions of driving behavior factors vary over time. To fill this gap, this study aims to obtain time-invariant driving behavior features and to establish their relationship with the occurrence probability of high-risk events. Specifically, a generalized modeling framework consisting of distribution characterization (DC) and distribution matching (DM) modules is proposed. The DC module splits the whole dataset into several segments with the largest distribution gaps between them, while the DM module identifies time-invariant driving behavior features by learning the knowledge common to the different segments. A gated recurrent unit (GRU) is then employed to mine these time-invariant features for estimating the high-risk event occurrence probability. Moreover, modified loss functions are introduced to handle the data imbalance caused by the rarity of high-risk events. Empirical analyses were conducted using data from online ride-hailing services. The experimental results show that the proposed generalized modeling framework achieved a 7.2% higher average precision than the traditional approach based on the i.i.d. assumption, and that the modified loss functions further improved model performance by 3.8%. Finally, a case study explored the benefits for driver management programs, demonstrating a 33.34% improvement in the precision of identifying drivers prone to high-risk events.
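As a minimal illustration of how a loss function can be modified for imbalanced data, the sketch below implements binary focal loss, which down-weights easy, well-classified samples so that rare positive events contribute more to training. Focal loss is an assumption chosen here for illustration only; the specific modified loss functions used in this study are not named above. The function name `focal_loss` and the parameters `alpha` and `gamma` are likewise illustrative.

```python
import math


def focal_loss(y_true, p_pred, alpha=0.25, gamma=2.0, eps=1e-12):
    """Mean binary focal loss (illustrative; not necessarily the study's loss).

    y_true : iterable of 0/1 labels (1 = high-risk event occurred)
    p_pred : iterable of predicted probabilities for the positive class
    alpha  : weight on the positive class (1 - alpha on the negative class)
    gamma  : focusing parameter; gamma = 0 reduces to weighted cross-entropy
    """
    total = 0.0
    count = 0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability given to the true class
        a_t = alpha if y == 1 else 1.0 - alpha   # class-dependent weight
        # (1 - p_t)^gamma shrinks the loss of confidently correct samples
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
        count += 1
    return total / count


if __name__ == "__main__":
    # A missed rare event (label 1, predicted 0.1) is penalized far more
    # than a correctly dismissed non-event (label 0, predicted 0.1).
    print(focal_loss([1], [0.1]))
    print(focal_loss([0], [0.1]))
```

With `gamma = 0` and `alpha = 0.5`, the function reduces to half the ordinary binary cross-entropy, which makes it easy to check against a standard baseline before tuning the two parameters on validation data.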