Abstract

Data-driven algorithms have been widely used as effective tools to mimic hydrologic systems. Unlike black-box models, decision tree algorithms offer transparent representations of systems and reveal useful information about the underlying process. A popular class of decision tree models is the model tree (MT), which is designed to predict continuous variables. Most MT algorithms employ an exhaustive search mechanism and a pre-defined splitting criterion to generate a piecewise linear model. However, this approach is computationally intensive, and the choice of splitting criterion can significantly affect the performance of the generated model. These drawbacks can limit the application of MTs to large datasets. To overcome these shortcomings, a new flexible Model Tree Generator (MTG) framework is introduced here. MTG is equipped with several modules to provide a flexible, efficient, and effective tool for generating MTs. The application of the algorithm is demonstrated by simulating controlled discharge from several reservoirs across the contiguous United States (CONUS).

Highlights

  • We evaluate the performance of the Model Tree Generator (MTG) framework along with CART and M5′ on the selected case studies

  • The superior performance of the MTG is due to the regressed lines employed in the terminal nodes

Introduction

Data-driven models are effective tools for simulating nonlinear and complex systems, and have been widely used for regression and classification problems [1,2]. These models rely on statistical and numerical approaches to simulate the underlying system, rather than on physics-based equations [3,4]. Among data-driven approaches, decision tree (DT) algorithms offer transparent representations of systems [5,6], in contrast to black-box models with uninterpretable or hidden logic [7]. DT algorithms describe a dependent (response) variable by partitioning the space of the independent (explanatory) variables into clusters of data [8,9]. The simplicity and accuracy of DT algorithms make them attractive to practitioners in many fields of study [10], including remote sensing [11,12,13], water resources management [14,15,16,17,18], and hydrology [19,20].
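To make the idea of partitioning the explanatory space concrete, the following is a minimal sketch of a single model-tree split on one explanatory variable. It is a generic illustration, not the MTG framework, CART, or M5′: the splitting criterion (total squared error of the two leaf regressions), the helper names `fit_line`, `best_split`, and `predict`, the `min_leaf` parameter, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares line y ~ a*x + b. Returns (a, b, sum of squared errors)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef[0], coef[1], float(resid @ resid)


def best_split(x, y, min_leaf=3):
    """Exhaustive search: try every threshold between consecutive sorted x
    values and keep the one whose two leaf regressions have the lowest
    total squared error (an assumed criterion, chosen for illustration)."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = None
    for i in range(min_leaf, len(x) - min_leaf + 1):
        if x[i] == x[i - 1]:
            continue  # no threshold can separate identical x values
        left = fit_line(x[:i], y[:i])
        right = fit_line(x[i:], y[i:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {
                "threshold": 0.5 * (x[i - 1] + x[i]),
                "left": left[:2],    # (slope, intercept) for x <= threshold
                "right": right[:2],  # (slope, intercept) for x > threshold
                "sse": total,
            }
    return best


def predict(tree, x):
    """Route each x to a leaf and evaluate that leaf's regression line."""
    x = np.asarray(x, dtype=float)
    a_l, b_l = tree["left"]
    a_r, b_r = tree["right"]
    return np.where(x <= tree["threshold"], a_l * x + b_l, a_r * x + b_r)


# Synthetic two-regime data (hypothetical, noise-free): the response
# follows one line up to x = 5 and a different line afterwards.
x = np.linspace(0.0, 10.0, 41)
y = np.where(x <= 5.0, 2.0 * x, 4.0 + 0.5 * x)

tree = best_split(x, y)
print(tree["threshold"], tree["left"], tree["right"])
```

Every candidate threshold requires two regressions, so the cost grows quickly with the number of samples and explanatory variables. A full model tree would apply such a search recursively to each resulting subset until a stopping rule is met.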
