Technical note: A significance test for data-sparse zones in scatter plots

V V Vetrova,W E Bardsley

doi:10.5194/hess-16-1255-2012

Abstract

Data-sparse zones in scatter plots of hydrological variables can be of interest in various contexts. For example, a well-defined data-sparse zone may indicate inhibition of one variable by another. It is of interest therefore to determine whether data-sparse regions in scatter plots are of sufficient extent to be beyond random chance. We consider the specific situation of data-sparse regions defined by a linear internal boundary within a scatter plot defined over a rectangular region. An Excel VBA macro is provided for carrying out a randomisation-based significance test of the data-sparse region, taking into account both the within-region number of data points and the extent of the region. Example applications are given with respect to a rainfall time series from Israel and also to validation scatter plots from a seasonal forecasting model for lake inflows in New Zealand.

Highlights

A visual examination of hydrological scatter plots is a useful first step toward considering possible relationships between variables, or for evaluation of the worth of hydrological forecasting models via validation plots of observed and predicted values
Our focus here is not on boundary estimation as such, but rather on providing a significance test against the null hypothesis that a data-sparse zone in a scatter plot has arisen by random chance
Given that there are m data points within the data-sparse area, the null hypothesis is that a data-sparse region at least as large and containing m data points could have arisen by random chance.

Summary

Introduction

A visual examination of hydrological scatter plots is a useful first step toward considering possible relationships between variables, or for evaluation of the worth of hydrological forecasting models via validation plots of observed and predicted values. Our focus here is not on boundary estimation as such, but rather on providing a significance test against the null hypothesis that a data-sparse zone in a scatter plot has arisen by random chance. The purpose of this short communication is to provide a practical significance test for the area of an observed data-sparse region with a linear internal boundary in a scatter plot, measured within the rectangular region that just encompasses all the data points. The approach adopted here generalises an earlier test described by Bardsley et al. (1999), which was restricted in practical application because it required the data-sparse region to contain no data points at all (m = 0). There is no general guarantee that higher levels of significance will be obtained for the test proposed here, as this depends on the data pattern of the scatter plot.
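The idea of a randomisation test on the area of a data-sparse region can be illustrated with a minimal Python sketch. This is not the authors' Excel VBA macro, and several choices below are assumptions made only for illustration: the sparse region is taken to lie above a line of fixed, user-chosen slope; the test statistic is the largest area above such a line that holds at most m points; and the random reference plots are generated by shuffling the y values against the x values, which leaves the enclosing rectangle unchanged. The function names `sparse_area` and `randomisation_p_value` are hypothetical.

```python
import random


def sparse_area(xs, ys, slope, m, n_strips=400):
    """Largest area above a line of the given slope holding at most m points.

    The area is measured inside the rectangle that just encloses the data.
    Assumes 0 <= m < len(xs).
    """
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    # Intercept the line would need in order to pass through each point.
    intercepts = sorted((y - slope * x for x, y in zip(xs, ys)), reverse=True)
    # Lower the line until it touches the (m+1)-th highest point, so that
    # at most m points lie strictly above it.
    c = intercepts[m]
    # Midpoint-rule integration of the height of the region above the line.
    width = (x1 - x0) / n_strips
    area = 0.0
    for k in range(n_strips):
        x = x0 + (k + 0.5) * width
        line_y = min(max(c + slope * x, y0), y1)  # clip the line to the rectangle
        area += (y1 - line_y) * width
    return area


def randomisation_p_value(xs, ys, slope, m, n_perm=199, seed=0):
    """Share of shuffled plots whose sparse area is at least the observed one."""
    rng = random.Random(seed)
    observed = sparse_area(xs, ys, slope, m)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # assumed null model: y unrelated to x
        if sparse_area(xs, shuffled, slope, m) >= observed:
            hits += 1
    # Count the observed plot itself as one of the possible arrangements.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: points on the line y = x leave the upper-left half empty.
    xs = list(range(30))
    ys = list(range(30))
    print(randomisation_p_value(xs, ys, slope=1.0, m=0))
```

Fixing the slope keeps the sketch short. A fuller treatment would also search over boundary orientations in each shuffled plot, so that the reference distribution reflects the freedom the analyst had in choosing the observed boundary.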

The test

Application to validation scatter plots

Discussion and conclusion


Journal: Hydrology and Earth System Sciences
Publication Date: Apr 26, 2012
Citations: 4
License type: CC BY 3.0


