Accounting for multiple testing in the analysis of spatio-temporal environmental data

José Cortés,Miguel Mahecha,Alexander Brenning,Markus Reichstein

doi:10.1007/s10651-020-00446-4

Abstract

The statistical analysis of environmental data from remote sensing and Earth system simulations often entails the analysis of gridded spatio-temporal data, with a hypothesis test being performed for each grid cell. When the whole image or a set of grid cells is analyzed for a global effect, the problem of multiple testing arises. When no global effect is present, we expect a fraction α of all grid cells to be false positives, and spatially autocorrelated data can give rise to clustered spurious rejections that can be misleading in an analysis of spatial patterns. In this work, we review standard solutions for the multiple testing problem and apply them to spatio-temporal environmental data. These solutions are independent of the test statistic, so any test statistic can be used (e.g., tests for trends or change points in time series). Additionally, we introduce permutation methods and show that they have greater statistical power. Real-world data are used to provide examples of the analysis, and the performance of each method is assessed in a simulation study. Unlike other simulation studies, ours comprehensively compares the statistical power of the presented methods. In conclusion, we present several statistically rigorous methods for analyzing spatio-temporal environmental data and controlling false positives. These methods allow the use of any test statistic in a wide range of applications in environmental sciences and remote sensing.

Highlights

A common strategy in analyzing gridded spatio-temporal data derived from remote sensing or Earth system models is to fit a statistical model at each grid cell (Julien and Sobrino 2009; Fensholt and Proud 2012; Beck and Goetz 2012; Eckert et al 2015; Zhang et al 2017).
All methods control the familywise error rate (FWER), but only the permutation method based on the maximum test statistic effectively controls the FWER at the desired nominal α_global level.
The supra-threshold cluster size (STCS) method is slightly conservative at low spatial autocorrelation (FWER between 0.02 and 0.03), but it approaches the nominal level as the autocorrelation becomes stronger.
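
The permutation method based on the maximum test statistic mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a data cube of shape (time, rows, columns), uses the absolute Pearson correlation with time as an example trend statistic, and assumes that time steps are exchangeable under the null hypothesis (no temporal autocorrelation). The names trend_statistic and max_statistic_test, and all parameter names, are hypothetical.

```python
import numpy as np


def trend_statistic(data: np.ndarray) -> np.ndarray:
    """Absolute Pearson correlation between the time index and each grid cell.

    data has shape (n_time, n_rows, n_cols); the result has shape (n_rows, n_cols).
    This statistic is only an example; any per-cell test statistic could be used.
    """
    n_time = data.shape[0]
    t = np.arange(n_time, dtype=float)
    t -= t.mean()
    centered = data - data.mean(axis=0)
    cov = np.tensordot(t, centered, axes=(0, 0))
    denom = np.sqrt((t ** 2).sum() * (centered ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        r = np.where(denom > 0, cov / denom, 0.0)  # constant series get r = 0
    return np.abs(r)


def max_statistic_test(data, alpha_global=0.05, n_perm=999, seed=0):
    """Return (rejection mask, threshold) from a max-statistic permutation test.

    Assumption: time steps are exchangeable under the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    observed = trend_statistic(data)

    # Null distribution of the image-wide maximum statistic.
    max_stats = np.empty(n_perm + 1)
    max_stats[0] = observed.max()  # the observed ordering counts as one permutation
    for i in range(1, n_perm + 1):
        # Shuffle whole time slices so the spatial structure of each image is kept.
        order = rng.permutation(data.shape[0])
        max_stats[i] = trend_statistic(data[order]).max()

    # One threshold for all grid cells, taken from the maximum distribution.
    threshold = np.quantile(max_stats, 1.0 - alpha_global)
    return observed > threshold, threshold


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    cube = rng.normal(size=(30, 8, 8))
    cube[:, :2, :2] += 0.5 * np.arange(30)[:, None, None]  # synthetic trend in one corner
    mask, thr = max_statistic_test(cube, n_perm=199)
    print("threshold:", thr, "rejected cells:", int(mask.sum()))
```

Because every grid cell is compared against a threshold derived from the image-wide maximum, a single false rejection anywhere in the image occurs with probability of roughly α_global under the stated exchangeability assumption. Shuffling whole time slices, rather than each cell independently, is what keeps the spatial autocorrelation of the data in the null distribution.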

Summary

Introduction

A common strategy in analyzing gridded spatio-temporal data derived from remote sensing or Earth system models is to fit a statistical model at each grid cell (Julien and Sobrino 2009; Fensholt and Proud 2012; Beck and Goetz 2012; Eckert et al 2015; Zhang et al 2017). When attempting to analyze this image to assess its collective significance (e.g., to identify significant patterns or an overall effect), we incur the multiple testing problem. This problem, which results in uncontrolled false-positive test results and consequent false scientific “discoveries”, has received relatively little attention in the environmental sciences and remote sensing, except for a small but growing number of climate science reports (Ventura et al 2004; Wilks 2006a, b). When multiple tests are performed (e.g., at the grid cell level), as is common in many environmental science studies, the probability of obtaining a significant result by chance (a false positive) greatly increases. The probability of at least one false positive among a “family” of tests is called the familywise error rate (FWER). Both of these drawbacks are further illustrated in the simulation study (Sect. 4.2).
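
To make the growth of the FWER concrete, the following minimal sketch computes it under the simplifying assumption that all m tests are independent and all null hypotheses are true, in which case FWER = 1 - (1 - α)^m. Independence is an illustrative assumption only, since gridded environmental data are typically spatially autocorrelated. The Bonferroni level α_global / m is shown as one well-known correction; it is named here for illustration and is not taken from the text above. The function names are hypothetical.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """FWER for m independent tests at per-test level alpha, all nulls true."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if m < 0:
        raise ValueError("m must be non-negative")
    # P(at least one false positive) = 1 - P(no false positive in any test)
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha_global: float, m: int) -> float:
    """Per-test level that bounds the FWER by alpha_global (Bonferroni)."""
    if m <= 0:
        raise ValueError("m must be positive")
    return alpha_global / m


if __name__ == "__main__":
    for m in (1, 10, 100, 10_000):
        uncorrected = fwer_independent(0.05, m)
        corrected = fwer_independent(bonferroni_level(0.05, m), m)
        print(f"m={m:>6}: uncorrected FWER={uncorrected:.4f}, Bonferroni FWER={corrected:.4f}")
```

Even for a modest image of 100 grid cells, the uncorrected FWER under these assumptions is above 0.99, so at least one spurious rejection is almost certain unless the per-test level is adjusted.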

Methods

Results

Conclusion

Journal: Environmental and Ecological Statistics	Publication Date: May 12, 2020
Citations: 18	License type: open-access
