Abstract

Globally available environmental observations (EOs), specifically from satellites and coupled Earth system models, represent some of the largest datasets of the digital age. As the volume of global EOs continues to grow, so does the potential of these data to help Earth scientists discover trends and patterns in Earth systems at large spatial scales. To leverage global EOs for scientific insight, Earth scientists need targeted and accessible exposure to skills in reproducible scientific computing and spatiotemporal data science, and they need to be empowered to apply their domain understanding to interpret data-driven models for knowledge discovery. The Generalizable, Reproducible, Robust, and Interpreted Environmental (GRRIEn) analysis framework was developed to prepare Earth scientists who have an introductory statistics background and limited or no understanding of programming and computational methods to use global EOs to generalize insights from local and regional field measurements across unsampled times and locations. GRRIEn analysis is generalizable, meaning that results from a sample are translated to landscape scales by combining direct environmental measurements with global EOs using supervised machine learning; robust, meaning that the model shows good performance on data with scale-dependent feature and observation dependence; reproducible, meaning that it is based on a standard repository structure so that other scientists can quickly and easily replicate the analysis with a few computational tools; and interpreted, meaning that Earth scientists apply domain expertise to ensure that model parameters reflect a physically plausible diagnosis of the environmental system. This tutorial presents standard steps for achieving GRRIEn analysis by combining conventions of rigor in traditional experimental design with the open-science movement.

Significance Statement

Earth science researchers in the digital age are often tasked with pioneering big data analyses, yet they have limited formal training in statistics and in computational methods such as databasing or computer programming. In bridging this training gap, they often spend tremendous amounts of time learning core computational skills and making core analytical mistakes, which puts the reputability of observational geostatistical research at risk. The GRRIEn analytical framework is a practical guide introducing community standards for each phase of the computational research pipeline (dataset engineering, model training, and model diagnostics) to promote rigorous, accessible use of global EOs in Earth systems research.

