This paper describes the development of a general operating policy for a water supply system using the methodology of data mining. To define an operating policy using this approach, both a single-reservoir and a multireservoir water system were modeled and optimized for a set of historical inflows. These optimization results defined the best possible performance for the systems with historical hindsight, and were used as input for the data mining process. The data mining algorithm then generated the set of control rules that gave the best historical operating policy. The data mining tool used in this work is based on the induction tree technique, C5.0, reported by Quinlan in 1993. However, the process of reservoir control rule extraction is not straightforward and requires several data preparation steps to enhance the performance of the data mining algorithm. To demonstrate the effectiveness of the rules developed through data mining, simulation runs of the system were performed. The results of these simulations were compared with simulation results using operating policies derived from linear regression. Another comparison between operating rules derived using different methodologies was performed for the multireservoir system where, in addition to data mining and regression-based rules, there were rules available from the U.K. Environment Agency (South West). The paper shows that “data-mined” rules come closest to the optimization results.