Chlorophyll-a (Chl-a) concentration indicates the abundance of phytoplankton biomass, and its spatiotemporal variations fundamentally modulate ecosystem dynamics in coastal waters. Compared to conventional low-frequency shipboard measurements at a limited number of sampling locations, satellite data provide better spatial and temporal coverage, allowing a more synoptic view of Chl-a variability in large coastal systems such as Chesapeake Bay. We used Visible Infrared Imaging Radiometer Suite (VIIRS) satellite data from 2011 to 2018 to (1) analyze Chl-a variability in Chesapeake Bay, (2) examine the robustness of an interpolation method for filling satellite data gaps, and (3) train a machine-learning-based data-driven model to simulate high-resolution Chl-a variations. The dataset shows clear seasonality with notable spring and summer peaks throughout the bay; this differs from in situ observations at mainstem stations, where the water is deep. The seasonality of Chl-a varies among regions, with maximum concentrations occurring in spring near the mouths of major tributaries, in winter near the bay entrance, and in summer elsewhere. A machine-learning-based data-driven model, together with Data Interpolating Empirical Orthogonal Functions (DINEOF), is applied to simulate the high-resolution Chl-a variations. DINEOF is used to efficiently estimate the missing records. Driven by external forcing including river discharge, nutrient loadings, solar radiation, wind, and air temperature, the data-driven model performs satisfactorily overall in reproducing the spatiotemporal variations of Chl-a, with a bay-wide averaged root mean square error of 1.85 µg/L. By combining DINEOF and machine learning, this study demonstrates the potential of data-driven models to predict high-resolution spatiotemporal variations of water quality in coastal waters.
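To illustrate the gap-filling idea behind DINEOF, the Python sketch below fills missing entries of a (time, space) matrix by iterating a truncated singular value decomposition, and computes a root mean square error. It is a simplified stand-in, not the implementation used in the study. The function names `dineof_fill` and `rmse`, the fixed `n_modes`, and the stopping parameters are all assumptions made for illustration. The full DINEOF method selects the number of modes by cross-validation, which is omitted here.

```python
import numpy as np


def dineof_fill(data, n_modes=2, max_iter=200, tol=1e-6):
    """Fill NaN gaps in a (time, space) matrix using an iterated truncated SVD.

    Simplified DINEOF-style sketch: the number of modes is fixed by the caller
    (an assumption) rather than chosen by cross-validation.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)

    # Work with anomalies about the mean of the observed values;
    # gaps start at zero anomaly as a first guess.
    mean = np.nanmean(data)
    filled = np.where(missing, 0.0, data - mean)

    for _ in range(max_iter):
        # Leading EOF modes of the current estimate.
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        recon = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]

        # Measure how much the gap values change, then update only the gaps;
        # observed values are never overwritten.
        if missing.any():
            change = np.sqrt(np.mean((recon[missing] - filled[missing]) ** 2))
        else:
            change = 0.0
        filled[missing] = recon[missing]
        if change < tol:
            break

    return filled + mean


def rmse(pred, obs):
    """Root mean square error between two arrays of equal shape."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


if __name__ == "__main__":
    # Synthetic low-rank field (hypothetical data, not Chl-a observations).
    t = np.linspace(0.0, 2.0 * np.pi, 30)
    x = np.linspace(0.0, 1.0, 20)
    truth = 5.0 + np.outer(np.sin(t), 1.0 + x)

    # Hide roughly 10% of the entries to mimic satellite data gaps.
    rng = np.random.default_rng(0)
    mask = rng.random(truth.shape) < 0.1
    gappy = truth.copy()
    gappy[mask] = np.nan

    filled = dineof_fill(gappy, n_modes=2)
    print("RMSE at the hidden points:", rmse(filled[mask], truth[mask]))
```

Withholding known values and scoring the reconstruction against them, as in the usage block, is one simple way to check how robust such an interpolation is before applying it to real gaps.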