The operating efficiency of a pumping station unit is affected by several factors, which leads to a large gap between theoretical and actual pumping efficiency. To simulate the operating efficiency of pumping station units accurately, this study examined data-driven efficiency simulation models from three aspects: ① model algorithm, ② feature input, and ③ response output. Eight machine learning models were used to build unit efficiency simulation models and were compared with traditional polynomial regression models, and the better-performing models were selected. The study then proposed using “upstream water level + downstream water level” (UWL + DWL) instead of the traditional “head” (H) as the feature input for training, and compared the simulation accuracy of fitting efficiency directly with that of fitting power and then calculating efficiency by formula. A case analysis was carried out using historical data from eight units at Pizhou Station and Suining 2nd Station on the East Line of the South-to-North Water Diversion Project. The experimental results show that: ① among all methods, the models based on Gaussian Process Regression (GPR) were the most accurate on the ERMS, EMA, R², and EMI indicators, with R² close to 0.92. ② After UWL + DWL was used to train the models, all test-set indicators of the GPR models improved, and the EMA indicator decreased from an average of 0.39 % to 0.26 %. ③ Fitting efficiency directly gave more accurate efficiency than calculating it from fitted power, but the R² for fitted power was higher than that for fitted efficiency, which may suggest an alternative optimization objective for the economic operation of pumping stations.
④ Overall, training the GPR models with UWL + DWL to simulate efficiency gives the highest accuracy; for example, the EMA and EMI indicators of the No. 4 unit at Pizhou Station can be reduced from 16.49 % and 20.40 % to 0.18 % and 1.55 %, respectively. The results have practical significance and can support the economical operation of pumping stations.
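
The two response-output routes compared in this study can be illustrated with a minimal Python sketch. It rests on several assumptions not stated in this abstract: the efficiency formula is taken to be the standard hydraulic definition η = ρgQH / P, head H is taken as the difference between the two water levels, and ERMS and EMA are taken to mean root-mean-square error and mean absolute error. The flow values, the records, and the model predictions are made up for illustration and are not station data; the function names are hypothetical.

```python
import math

RHO = 1000.0  # water density in kg/m^3 (assumed constant)
G = 9.81      # gravitational acceleration in m/s^2


def head_from_levels(level_a_m, level_b_m):
    """Head H (m), assumed to be the water-level difference across the station."""
    return abs(level_a_m - level_b_m)


def efficiency_from_power(flow_m3s, head_m, power_kw):
    """Unit efficiency in percent: hydraulic output power over input power."""
    hydraulic_kw = RHO * G * flow_m3s * head_m / 1000.0
    return 100.0 * hydraulic_kw / power_kw


def rmse(actual, predicted):
    """Root-mean-square error (assumed meaning of ERMS)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error (assumed meaning of EMA)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination R^2."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up records for illustration only: (UWL m, DWL m, flow m^3/s, measured power kW).
records = [(21.0, 24.0, 30.0, 1200.0), (21.5, 24.0, 31.0, 1050.0), (20.5, 24.5, 28.0, 1500.0)]
# Hypothetical outputs of two already-trained models for the same records.
predicted_efficiency = [73.0, 72.8, 73.6]      # route 1: efficiency fitted directly (%)
predicted_power_kw = [1210.0, 1040.0, 1490.0]  # route 2: power fitted first (kW)

heads = [head_from_levels(u, d) for u, d, _, _ in records]
measured = [efficiency_from_power(q, h, p) for (_, _, q, p), h in zip(records, heads)]
# Route 2 converts the predicted power into efficiency with the same formula.
indirect = [efficiency_from_power(q, h, p_hat)
            for (_, _, q, _), h, p_hat in zip(records, heads, predicted_power_kw)]

for name, pred in (("direct", predicted_efficiency), ("via power", indirect)):
    print(f"{name}: RMSE={rmse(measured, pred):.3f} MAE={mae(measured, pred):.3f} "
          f"R2={r2(measured, pred):.3f}")
```

In the second route, any error in the predicted power passes through the division in the formula before it reaches the efficiency value, which is one possible reason the two routes can rank differently on efficiency error and on R².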