Fortnightly observations of water quality parameters, discharge and water temperature along the River Elbe have been subjected to a multivariate data analysis. In a previous study, Petersen et al. [Petersen, W., Bertino, L., Callies, U., Zorita, E., 2001. Process identification by principal component analysis of river-quality data. Ecol. Model. 138, 193–213] applied principal component analysis (PCA) to show that 60% of the variability in the data set can be explained by just two linear combinations of the eight original variables. In the present paper, more advanced multivariate methods, which are expected to be better suited to interpretation in terms of the underlying system dynamics, are applied to the same data set. The first method, graphical modelling, represents interaction structures in terms of a set of conditional independence constraints between pairs of variables given the values of all other variables. Assuming the data come from a multinormal distribution, conditional independence constraints are expressed by zero partial correlations. Different graphical structures, with a node for each variable and connecting edges between them, can be assessed with regard to their likelihood. The second method, canonical correlation analysis (CCA), is applied to study the correlation structure between external forcing and water quality parameters. The results of CCA turn out to be consistent with the dominant patterns of variability obtained from PCA. The percentages of variability explained by external forcing, however, are estimated to be smaller. Fitting graphical models allows a more detailed representation of interaction structures. For instance, for given discharge and temperature, the correlated variations of the concentrations of oxygen and nitrate can be modelled as being mediated by variations of pH, which serves as a proxy for algal activity.
Replacing the full covariance structure by considerably simplified graphical models does not much affect the outcomes of either PCA or CCA, and hence it is concluded that these graphical models successfully represent the main interaction structures contained in the covariance matrix of the data. The analysed conditional independence patterns provide constraints to be satisfied by, for instance, directed probabilistic networks.
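
The link between conditional independence and zero partial correlations under a multinormal assumption can be illustrated with a minimal sketch. It uses synthetic data only, not the Elbe observations; the variable names, coefficients and sample size are assumptions chosen for illustration. Partial correlations given all remaining variables are obtained from the inverse covariance (precision) matrix Ω as −Ω_ij / sqrt(Ω_ii Ω_jj). In the simulated example, oxygen and nitrate both depend on pH but not directly on each other, so their marginal correlation is substantial while their partial correlation given pH is close to zero.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of each pair of columns given all other columns.

    Computed from the inverse of the sample covariance matrix:
    rho_ij.rest = -omega_ij / sqrt(omega_ii * omega_jj).
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Synthetic illustration (assumed coefficients, not fitted to any real data):
# pH drives both oxygen and nitrate; there is no direct oxygen-nitrate link.
rng = np.random.default_rng(0)
n = 5000
ph = rng.normal(size=n)
oxygen = 0.8 * ph + 0.6 * rng.normal(size=n)
nitrate = -0.7 * ph + 0.7 * rng.normal(size=n)

data = np.column_stack([oxygen, ph, nitrate])
marginal = np.corrcoef(data, rowvar=False)
partial = partial_correlations(data)

print("marginal corr(oxygen, nitrate):", round(marginal[0, 2], 3))
print("partial corr(oxygen, nitrate | pH):", round(partial[0, 2], 3))
```

In a graphical model for these three synthetic variables, the near-zero partial correlation corresponds to a missing edge between oxygen and nitrate, with both connected to pH.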