Pedestrians' tolerance of red light signals is a crucial factor in urban residents' travel experience and in road traffic safety. To address the lack of methods for measuring how long pedestrians will tolerate a red light at crossings, this paper proposes a stacking model for predicting the red light tolerance time of individual pedestrians. The model uses XGBoost (XGB), random forest (RF), support vector machine regression (SVR), and multilayer perceptron (MLP) models as first-layer base models and a multiple linear regression model as the second-layer meta-model, with Bayesian hyperparameter tuning. Using random survival forest (RSF) and K-means clustering methods, the dataset was divided into three categories: low, medium, and high tolerance groups. The feature variables influencing the red light tolerance time were likewise divided into feature groups, and a stacking model was established for each combination of tolerance group and feature group. The experimental results demonstrated that the feature group selected by voting across multiple machine learning models performed best in all three tolerance groups. For the low tolerance group, the mean squared error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) were 6.58, 1.91, and 19.78%, respectively. For the medium tolerance group, the MSE, MAE, and MAPE were 4.82, 1.53, and 7.63%, respectively. For the high tolerance group, the MSE, MAE, and MAPE were 33.32, 3.89, and 10.14%, respectively. Individual XGB, RF, SVR, and MLP models and an ungrouped stacking model were established for comparison on the test set. The results indicated that the proposed grouped stacking model outperformed the other models overall. The method provides a means of obtaining the probability distribution function of the waiting crowd's tolerance time for each red light range, and thus of determining the red light duration at a crossing signal for different crowd compositions.
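For readers who wish to reproduce the three evaluation metrics reported above, the following is a minimal sketch of how MSE, MAE, and MAPE are conventionally computed. It assumes observed and predicted tolerance times are given as equal-length sequences and that no observed value is zero (MAPE is undefined otherwise). The function names and the sample values are hypothetical illustrations, not the paper's code or data.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes every observed value is non-zero.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted tolerance times (illustrative only).
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 44.0]

print(f"MSE  = {mse(observed, predicted):.2f}")
print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"MAPE = {mape(observed, predicted):.2f}%")
```

Because MSE is expressed in squared units and MAPE is relative to the observed value, the three metrics need not rank groups in the same order, which is consistent with the high tolerance group showing the largest MSE but not the largest MAPE.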