The concentration of chlorophyll a in phytoplankton and periphyton serves as a measure of algal biomass. We compiled an 18-year record (2005–2022) of pigment data from water bodies across the United States (US) to support the development of process-based, machine learning, and remote sensing models for predicting harmful algal blooms (HABs). To our knowledge, this dataset of nearly 84,000 sites and over 1,374,000 pigment measurements is the largest compilation of harmonized, discrete, laboratory-extracted chlorophyll data for the US. The data were compiled from the Water Quality Portal (WQP) and from previously unpublished data from the U.S. Geological Survey National Water Quality Laboratory (NWQL). Data were harmonized for reporting units, pigment type, duplicate values, collection depth, site name, negative values, and some extreme values. Across the country, sampling frequency, distribution, and methods vary greatly by state. Uses for such data include calibration of models and field sensors, examination of relationships to nutrients and other drivers, evaluation of temporal trends, and other applications addressing local- to national-scale concerns.