Remote sensing water quality monitoring can compensate for the limited spatiotemporal coverage of traditional water quality monitoring methods. Although remote sensing water quality inversion models based on spectral features have achieved many successes, they may still generalize poorly when monitoring the water quality of complex river networks in large cities. In this paper, we propose a spectro-environmental factors integrated ensemble learning model for urban river network water quality inversion. We analyzed the correlation between water quality parameters, spectral reflectance, and environmental factors based on an in-situ dataset collected in the northern part of Shanghai. Using Hot Spot Analysis (Getis-Ord Gi*), we found that river network water quality parameters exhibit different spatial patterns in different urban functional zones. Using factor analysis and Pearson correlation coefficient analysis, we selected daily average temperature and seven-day total rainfall as the environmental features and several band combinations as the spectral features. After the feature analysis, the spectro-environmental factors integrated ensemble learning model was trained. Compared with machine learning inversion models based on spectral features alone, the coefficient of determination (R²) increased by about 0.50. The model was also tested on low-altitude multispectral remote sensing images in three test areas in Shanghai, located both within and outside the in-situ sampling areas. Within the in-situ sampling areas, the R² values for total phosphorus (TP), ammonia nitrogen (NH3-N), and chemical oxygen demand (COD) were 0.52, 0.58, and 0.56, respectively, and the mean absolute percentage errors (MAPE) were 53.36%, 63.95%, and 22.46%, respectively. When the area outside the in-situ sampling areas was included, the R² values for TP, NH3-N, and COD were 0.47, 0.47, and 0.53, and the MAPE values were 49.38%, 74.46%, and 20.49%, respectively. Our research provides a new remote sensing water quality inversion method for complex urban river networks that exhibits solid accuracy and generalization ability.
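For reference, the two evaluation metrics reported above can be computed as in the following minimal sketch. It uses the standard definitions of R² and MAPE; the function names and the sample concentrations are hypothetical illustrations and are not taken from the study's data or code.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))


# Hypothetical measured vs. predicted concentrations (mg/L), for illustration only.
measured = [0.10, 0.20, 0.40]
predicted = [0.12, 0.18, 0.30]

print(f"R2   = {r2_score(measured, predicted):.3f}")
print(f"MAPE = {mape(measured, predicted):.2f}%")
```

Because MAPE divides each error by the measured value, parameters that often take small measured values can show a large MAPE even when R² is moderate, so the two metrics are best read together.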