Biodiversity loss will lead to a serious decline in ecosystem services, which will ultimately affect human well-being and survival. Monitoring the spatial and temporal dynamics of grassland biodiversity is therefore essential for grassland conservation and sustainable development. This study integrated ground monitoring data, Landsat remote sensing data, and environmental variables for the Three Rivers Headwater Region (TRHR) from 2000 to 2021. We established a model for estimating grassland species diversity and analyzed the spatial and temporal patterns, trends, and driving factors of grassland species diversity over these 22 years. Among the models built from different variable selection and machine learning methods, the random forest (RF) model combined with stepwise regression (STEP) performed best for estimating grassland species diversity, with an R² of 0.44 and an RMSE of 2.56 n/m² on the test set. Spatially, species diversity was high in the southeast and low in the northwest. Trend analysis revealed that species diversity increased in 80.46% of the area and decreased in 16.59% of the area. The analysis of driving factors indicated that changes in species diversity in the study area over the 22 years were driven by both climate change and human activities, with temperature being the most significant driving factor. This study demonstrates effective large-scale monitoring of grassland species diversity, thereby supporting biodiversity monitoring and grassland resource management.
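For readers unfamiliar with the two test-set metrics reported above, the following is a minimal sketch of how R² and RMSE can be computed from observed and predicted species diversity. It assumes that R² denotes the coefficient of determination, which the abstract does not state explicitly. The function names and the sample values are hypothetical and serve only as an illustration; they are not the study's data or code.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same unit as the inputs (e.g. n/m2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical plot-level species counts per square meter (illustration only).
observed = [8.0, 12.0, 15.0, 10.0, 20.0]
predicted = [10.0, 11.0, 13.0, 12.0, 17.0]

print(f"R2 = {r2(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f} n/m2")
```

Because RMSE is expressed in the unit of the target variable, the reported 2.56 n/m² can be read directly as a typical estimation error in species per square meter, whereas R² is unitless and describes the share of variance in the observations that the model explains.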