Linear regression is widely used in applied sciences and, in particular, in satellite optical oceanography, to relate dependent to independent variables. It is often adopted to establish empirical algorithms based on a finite set of measurements, which are later applied to observations on a larger scale from platforms such as autonomous profiling floats equipped with optical instruments (e.g., Biogeochemical Argo floats; BGC-Argo floats) and satellite ocean colour sensors (e.g., SeaWiFS, VIIRS, OLCI). However, several methods can be used to determine the coefficients of the linear equation fitted to a given pair of variables, so these coefficients are not unique. In this work, we quantify the impact of the choice of “regression method” (i.e., either type-I or type-II) on the derived bio-optical relationships, both from a theoretical perspective and through specific examples. We applied commonly used regression methods to an in situ data set of particulate organic carbon (POC), total chlorophyll-a (TChla), and optical particulate backscattering coefficient (bbp), and to 19 years of monthly TChla and bbp ocean colour data. The results of the regression analysis were used to calculate phytoplankton carbon biomass (Cphyto) and POC from: i) BGC-Argo float observations; ii) oceanographic cruises; and iii) satellite data. These applications highlight the differences in Cphyto and POC estimates that arise from the choice of method. An analysis of the statistical properties of the data set, together with a detailed description of the hypothesis underlying the work, drives the selection of the linear regression method.
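
As an illustration of why the fitted coefficients are not unique, the following minimal sketch compares a type-I fit (ordinary least squares of y on x) with one common type-II fit (reduced major axis). The abstract does not state which type-II method the authors used, so the choice of reduced major axis is an assumption. The synthetic data, the noise levels, and the function names `ols_fit` and `rma_fit` are hypothetical and are not taken from the paper.

```python
import numpy as np


def ols_fit(x, y):
    """Type-I fit (ordinary least squares of y on x): returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


def rma_fit(x, y):
    """Type-II fit (reduced major axis, assumed here): returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    # The slope is the ratio of standard deviations, signed by the correlation.
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


# Synthetic example (hypothetical values): both variables carry measurement noise.
rng = np.random.default_rng(0)
x_true = rng.uniform(0.0, 10.0, size=200)
x_obs = x_true + rng.normal(0.0, 1.0, size=200)
y_obs = 2.0 * x_true + 1.0 + rng.normal(0.0, 2.0, size=200)

ols_slope, ols_intercept = ols_fit(x_obs, y_obs)
rma_slope, rma_intercept = rma_fit(x_obs, y_obs)
r = np.corrcoef(x_obs, y_obs)[0, 1]

print(f"type-I  (OLS): slope={ols_slope:.3f}, intercept={ols_intercept:.3f}")
print(f"type-II (RMA): slope={rma_slope:.3f}, intercept={rma_intercept:.3f}")
print(f"correlation r={r:.3f}")
```

For the same pair of variables, the two slopes are related by OLS slope = |r| × RMA slope, so they coincide only when the correlation is perfect. Both lines pass through the point of means, which means the intercepts differ whenever the slopes do.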