Abstract

Machine learning algorithms are increasingly applied to spatial numerical modeling, so it is important to understand when such methods will underperform. Machine learning algorithms are affected by dataset shift: when the modeling domain of interest is non-stationary, there is no guarantee that a trained model will be effective in unsampled areas. This work compares the stationarity requirement of geostatistical methods with the concept of dataset shift. A workflow is developed to detect dataset shift in spatial data prior to modeling; it applies a discriminative classifier and a two-sample Kolmogorov-Smirnov test to the model areas. Where shift is detected, a lazy learning modification of support vector regression (SVR) is proposed to account for it. The benefits of the lazy learning algorithm are demonstrated on the well-known non-stationary Walker Lake dataset, where it improves root mean squared error by up to 25% relative to the standard SVR approach in areas where dataset shift is present.
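
The detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which discriminative classifier is used, so a leave-one-out k-nearest-neighbour classifier stands in for it here, and the Kolmogorov-Smirnov decision uses the standard large-sample critical value rather than an exact p-value. The function names (`ks_shift`, `knn_domain_accuracy`), the choice of `k`, and the synthetic data are all assumptions made for illustration. The idea is that samples from the training area are labelled 0 and samples from the target area are labelled 1; if a classifier separates the two clearly better than chance, or the KS test rejects equality of the two distributions, dataset shift is suspected.

```python
import math
import random


def ks_shift(a, b, alpha=0.05):
    """Two-sample KS statistic D and a large-sample reject/accept decision."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between their ECDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic critical value: c(alpha) * sqrt((n + m) / (n * m)).
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return d, d > critical


def knn_domain_accuracy(train_pts, target_pts, k=5):
    """Leave-one-out k-NN accuracy for telling training points (0) from target points (1).

    Accuracy near 0.5 suggests the two areas look alike; accuracy near 1.0
    suggests the feature distributions differ (assumed stand-in classifier).
    """
    pts = [(p, 0) for p in train_pts] + [(p, 1) for p in target_pts]
    correct = 0
    for idx, (p, label) in enumerate(pts):
        neighbours = sorted(
            (math.dist(p, q), lab) for j, (q, lab) in enumerate(pts) if j != idx
        )
        votes = sum(lab for _, lab in neighbours[:k])
        pred = 1 if 2 * votes > k else 0  # majority vote; ties go to label 0
        correct += pred == label
    return correct / len(pts)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic two-feature samples; the target area has a shifted first feature.
    train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
    target = [(random.gauss(1.5, 1), random.gauss(0, 1)) for _ in range(100)]
    acc = knn_domain_accuracy(train, target)
    d, flagged = ks_shift([p[0] for p in train], [p[0] for p in target])
    print(f"domain-classifier accuracy: {acc:.2f}")
    print(f"KS statistic: {d:.3f}, shift flagged at 5%: {flagged}")
```

In practice the KS test would be applied per variable, since it is a univariate test, while the classifier can use all features jointly; the two checks are therefore complementary rather than redundant.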

