At public beaches, it is now common to mitigate the impact of water-borne pathogens by posting a swimmer's advisory when the concentration of fecal indicator bacteria (FIB) exceeds an action threshold. Since culturing the bacteria delays public notification when dangerous conditions exist, regression models are sometimes used to predict the FIB concentration based on readily-available environmental measurements. It is hard to know which environmental parameters are relevant to predicting FIB concentration, and the parameters are usually correlated, which can hurt the predictive power of a regression model. Here the method of partial least squares (PLS) is introduced to automate the regression modeling process. Model selection is reduced to the process of setting a tuning parameter to control the decision threshold that separates predicted exceedances of the standard from predicted non-exceedances. The method is validated by application to four Great Lakes beaches during the summer of 2010. Performance of the PLS models compares favorably to that of the existing state-of-the-art regression models at these four sites.
Read full abstract