Abstract

Data-sparse zones in scatter plots of hydrological variables can be of interest in various contexts. For example, a well-defined data-sparse zone may indicate inhibition of one variable by another. It is therefore of interest to determine whether a data-sparse region in a scatter plot is too extensive to be attributed to random chance. We consider the specific situation of data-sparse regions defined by a linear internal boundary within a scatter plot defined over a rectangular region. An Excel VBA macro is provided for carrying out a randomisation-based significance test of the data-sparse region, taking into account both the number of data points within the region and the extent of the region. Example applications are given for a rainfall time series from Israel and for validation scatter plots from a seasonal forecasting model for lake inflows in New Zealand.

Highlights

  • A visual examination of hydrological scatter plots is a useful first step toward considering possible relationships between variables, or for evaluation of the worth of hydrological forecasting models via validation plots of observed and predicted values

  • Our focus here is not on boundary estimation as such, but rather on providing a significance test against the null hypothesis that a data-sparse zone in a scatter plot has arisen by random chance

  • Given that there are m data points within the data-sparse area, the null hypothesis is that a data-sparse region at least as large and containing m data points could have arisen by random chance


Summary

Introduction

A visual examination of hydrological scatter plots is a useful first step toward considering possible relationships between variables, or toward evaluating the worth of hydrological forecasting models via validation plots of observed and predicted values. Our focus here is not on boundary estimation as such, but rather on providing a significance test against the null hypothesis that a data-sparse zone in a scatter plot has arisen by random chance. The purpose of this short communication is to provide a practical significance test for the area of an observed data-sparse region with a linear internal boundary, where the scatter plot is taken over the rectangular region which just encompasses all the data points. The approach adopted here generalises an earlier test described by Bardsley et al. (1999), which was restricted in practical application because it required the data-sparse region to contain no data points at all (m = 0). There is no general guarantee that the test proposed here will yield higher levels of significance, as this depends on the data pattern of the scatter plot.
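The general idea of a randomisation test of this kind can be illustrated with a short Python sketch. This is not the authors' Excel VBA macro, and the details are assumptions made for illustration only: the data-sparse region is taken to be a triangle cut from the upper-left corner of the bounding rectangle, candidate boundaries are drawn from a coarse grid, and the null distribution is generated by randomly permuting the y values against the x values. The function names `max_sparse_area` and `sparse_region_p_value` are hypothetical.

```python
import random


def max_sparse_area(xs, ys, m, grid=20):
    """Largest upper-left corner triangle (as a fraction of the bounding
    rectangle) that contains at most m points.

    Assumption: x and y each take at least two distinct values, so the
    bounding rectangle has non-zero width and height.
    """
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    # Rescale to the unit square that just encompasses all data points.
    u = [(x - x0) / (x1 - x0) for x in xs]
    v = [(y - y0) / (y1 - y0) for y in ys]
    best = 0.0
    for i in range(1, grid + 1):
        a = i / grid  # leg of the triangle along the top edge
        for j in range(1, grid + 1):
            b = j / grid  # leg of the triangle down the left edge
            # Triangle vertices: (0, 1), (a, 1), (0, 1 - b).
            # Points exactly on the sloping boundary are not counted.
            inside = sum(1 for p, q in zip(u, v) if p / a + (1 - q) / b < 1)
            if inside <= m:
                best = max(best, 0.5 * a * b)
    return best


def sparse_region_p_value(xs, ys, m, n_perm=999, grid=20, seed=0):
    """Permutation p-value for the observed maximum sparse area.

    Assumed null model: y is unrelated to x, simulated by shuffling y.
    """
    rng = random.Random(seed)
    observed = max_sparse_area(xs, ys, m, grid)
    shuffled = list(ys)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_sparse_area(xs, shuffled, m, grid) >= observed:
            as_extreme += 1
    # Count the observed arrangement as one of the permutations.
    return (as_extreme + 1) / (n_perm + 1)
```

A call such as `sparse_region_p_value(xs, ys, m=2)` would then give the proportion of randomised scatter plots whose largest corner region with at most two points is at least as large as the one observed. Allowing m > 0 is what distinguishes this kind of test from one that requires a completely empty region.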

The test
Application to validation scatter plots
Discussion and conclusion
