The ecosystems of semi-arid watersheds are shaped by natural climatic factors, rainfall, and habitat destruction, which together produce complex mechanisms of spatial differentiation and evolution in water ecological health. Indicator selection in mainstream water ecological health assessment methods, such as the Index of Biotic Integrity (IBI), often relies on subjective choices of reference points and tends to overlook the combined impacts of, and interactions among, environmental stressors. For watersheds strongly influenced by natural climatic factors, this introduces considerable uncertainty and leaves water ecological health protection goals without a sound scientific basis. In this study, the ability of the random forest (RF) model to capture nonlinear relationships was used to reduce subjectivity in traditional water ecological health assessment and to reveal more accurately the emerging spatial differentiation patterns and underlying causes of water ecological health in the Wei River Basin (WRB), the largest typical semi-arid watershed of the Yellow River in China. Our findings are as follows: (1) Under traditional evaluation indices, the overall water ecological health of the WRB is classified as sub-healthy (60%). The core indicators are dominant species, total algal density, and the percentage of diatom density, and no significant spatial differentiation is observed. (2) An improved water ecological health assessment method for semi-arid watersheds, based on the RF model, was developed to replace the traditional subjective judgment steps. The method establishes a multi-input–output response relationship (R² > 0.85) between environmental stress indicators and the biological integrity index for the WRB.
(3) The model identifies the key driving factors behind changes in water ecological health in semi-arid watersheds, and its sensitivity is nearly 11-fold that of traditional IBI methods. (4) With the improved method, the water ecological health of the WRB exhibits significant spatial heterogeneity, with a higher dispersion coefficient (1.21), and a stronger nonlinear response to climatic factors. The application of machine learning models indicates that traditional methods may underestimate the extent of ecological health degradation in watersheds and tend to oversimplify spatial heterogeneity.
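The kind of response model described in finding (2) can be illustrated with a minimal sketch. It assumes scikit-learn's `RandomForestRegressor` and uses synthetic placeholder data, not the WRB measurements. The stressor names, the functional form of the synthetic response, and the helper `fit_stress_response` are all hypothetical and chosen only for illustration; the study's actual indicators, hyperparameters, and validation scheme are not specified here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def fit_stress_response(X, y, seed=0):
    """Fit a random forest mapping stressor values to an integrity score.

    Returns the fitted model and its R^2 on a held-out 25% split.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    # Held-out R^2 indicates how well the fitted response generalizes.
    r2 = r2_score(y_test, model.predict(X_test))
    return model, r2


# Synthetic placeholder data (assumption, NOT from the study):
# four hypothetical stressors scaled to [0, 1] at 300 hypothetical sites.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
# Hypothetical integrity score: a strongly nonlinear response to stressor_0,
# a weak linear response to stressor_1, and small observation noise.
y = (
    50.0
    + 30.0 * np.tanh(3.0 * (X[:, 0] - 0.5))
    - 5.0 * X[:, 1]
    + rng.normal(0.0, 1.0, size=300)
)

model, r2 = fit_stress_response(X, y)

# Rank stressors by impurity-based importance (importances sum to 1).
names = ["stressor_0", "stressor_1", "stressor_2", "stressor_3"]
ranking = sorted(
    zip(names, model.feature_importances_), key=lambda pair: pair[1], reverse=True
)
print(f"held-out R^2: {r2:.2f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this sketch the importance ranking stands in for the identification of key driving factors. Impurity-based importances can be biased toward high-variance features, so a permutation-based measure may be a reasonable alternative in practice.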