Abstract

Measuring and forecasting changes in coastal and deep-water ecosystems and climates requires sustained long-term measurements from marine observation systems. One of the key considerations in analyzing data from marine observatories is quality assurance (QA). The data acquired by these infrastructures accumulate to gigabytes and terabytes per year, necessitating accurate automatic identification of false samples. A particular challenge in the QA of oceanographic datasets is to avoid disqualifying data samples that appear to be outliers but actually represent real, short-term phenomena of importance. In this paper, we present a novel cross-sensor QA approach that validates the decision to disqualify a data sample from an examined dataset by comparing it to samples from related datasets. This group of related datasets is chosen to reflect the same oceanographic phenomena, enabling some prediction of the examined dataset. In our approach, a disqualification is validated if the detected anomaly is present only in the examined dataset and not in its related datasets. Results for a surface water temperature dataset recorded by our Texas A&M-Haifa Eastern Mediterranean Marine Observatory (THEMO) over a period of 7 months show an improved trade-off between accurate and false disqualification rates when compared to two standard benchmark schemes.
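The validation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the anomaly detector or how related datasets are used for prediction, so the median/MAD outlier score, the threshold of 3.5, the function names and the sample values below are all assumptions made for illustration only.

```python
from statistics import median


def robust_z_scores(series):
    """Score each sample by its distance from the median, scaled by the MAD.

    Assumption: a median/MAD score stands in for whatever anomaly
    detector the actual QA scheme uses.
    """
    med = median(series)
    mad = median([abs(x - med) for x in series])
    # 1.4826 makes the MAD comparable to a standard deviation for
    # normally distributed data; the fallback avoids division by zero.
    scale = 1.4826 * mad if mad > 0 else 1e-9
    return [abs(x - med) / scale for x in series]


def validated_disqualifications(examined, related, threshold=3.5):
    """Return indices flagged in `examined` but in none of the `related` series.

    A flagged sample is disqualified only when the anomaly is absent from
    every related dataset at the same time index. All series are assumed
    to be time-aligned and of equal length (hypothetical simplification).
    """
    examined_flags = [z > threshold for z in robust_z_scores(examined)]
    related_flags = [
        [z > threshold for z in robust_z_scores(series)] for series in related
    ]
    validated = []
    for i, flagged in enumerate(examined_flags):
        if not flagged:
            continue
        seen_elsewhere = any(flags[i] for flags in related_flags)
        if not seen_elsewhere:
            validated.append(i)
    return validated


# Hypothetical, made-up samples: index 4 spikes only in the water
# temperature series, index 7 spikes in both series.
water_temp = [20.0, 20.1, 20.2, 20.1, 27.0, 20.0, 20.2, 23.5, 20.1, 20.0]
air_temp = [22.0, 22.1, 22.3, 22.2, 22.1, 22.0, 22.2, 25.9, 22.1, 22.0]

print(validated_disqualifications(water_temp, [air_temp]))
```

In this toy example the spike at index 4 is disqualified because it appears only in the examined series, while the spike at index 7 is retained because the related series shows an anomaly at the same time, which suggests a real short-term event rather than a faulty sample.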

Highlights

  • From the Texas A&M-Haifa Eastern Mediterranean Marine Observatory (THEMO) [3], which produces data samples simultaneously from 40 sensors every 30 min, we have collected more than 2.5 million data samples over a period of 18 months.

  • We explore the results of our cross-sensor quality assurance (QA) scheme for the surface water temperature dataset measured at the THEMO observatory

  • We explored the use of multiple sensors for the task of validating quality assurance (QA) decisions for datasets from a marine observatory


Summary

Introduction

Understanding the ever-changing oceans, biota and atmosphere is one of the greatest global challenges. The future of measuring and forecasting trends in coastal and deep-water ecosystems and climates lies in obtaining long-term time series from marine observation systems. A new era in ocean observation has begun: an integrated approach to the gathering and sharing of information. There are already hundreds of marine observatories, each collecting vast amounts of time-series data samples, with applications ranging from oil spill monitoring [1] to global meteorological and oceanographic coverage [2]. Because the acquired data are used to derive conclusions about climate change, weather patterns and marine biodiversity, and to inform public opinion and legislation, they must be accurate enough to reflect trends of real phenomena. With billions of data samples collected, man-in-the-loop QA becomes impractical, necessitating automation. From the Texas A&M-Haifa Eastern Mediterranean Marine Observatory (THEMO) [3], which produces data samples simultaneously from 40 sensors every 30 min, we have collected more than 2.5 million data samples over a period of 18 months.

