Abstract The database in digital twin technology holds data on the structure, performance, and operating status of the physical object, and it is the key to constructing a digital twin system. However, the databases of different digital twin systems usually contain different information. In this paper, the flow state of a pumping station under different angles of its four blades is selected as the research object. The research methods are Latin hypercube sampling, numerical simulation, and neural network prediction: Latin hypercube sampling is used to select typical angle-difference models, numerical simulation is used to obtain flow field information, and neural network prediction is used to construct a complete database. The study takes an actual large-scale mixed-flow pump station as the case, with head, efficiency, and shaft power as the prediction targets. It uses ten sets of angle-difference data for the four blades as the training set and the remaining forty sets as the experiment set, and employs a backpropagation neural network to construct the prediction network. By changing the experiment set and test set, the method can efficiently construct a database that includes both actual test and numerical simulation data, which provides a solid foundation for the construction of a digital twin platform.
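
The sampling step can be illustrated with a minimal sketch of Latin hypercube sampling over the four blade angles. It is not the authors' implementation. The function name `latin_hypercube`, the angle range of ±2 degrees, and the total of 50 samples (inferred from the ten training and forty remaining sets) are assumptions made only for illustration.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Latin hypercube sample: each dimension is split into n_samples
    equal strata, and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        width = (high - low) / n_samples
        # One uniformly placed point inside each stratum.
        columns.append([low + (k + rng.random()) * width for k in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical range: angle deviation of each of the four blades, in degrees.
BOUNDS = [(-2.0, 2.0)] * 4

samples = latin_hypercube(50, BOUNDS)
# Split mirroring the abstract: ten sets for training, forty remaining.
train, remaining = samples[:10], samples[10:]
```

Each tuple in `samples` is one candidate blade-angle combination. In the workflow described above, such combinations would be passed to the numerical simulation to obtain head, efficiency, and shaft power.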