Gard (2009) compared the performance of two instream flow models, PHABSIM and River2D, that combine one- or two-dimensional (1D or 2D) hydraulic models with simple biological models called habitat suitability criteria (HSC) to estimate an index of habitat called weighted usable area (WUA). PHABSIM and related models such as River2D are widely used for setting instream flow standards, so tests of the models are badly needed (Williams 2001). Although Gard (2009) collected useful data for testing the models, he failed to make good use of them, or to present the data in a form that would allow others to do so. Here, I summarize Gard’s (2009) analysis, critique it, and suggest how the data could be better used. I also suggest a statistically and biologically superior alternative to the HSC used in both the 1D and 2D instream flow models. I do not discuss the basic adequacy of such models for their intended function, which can be questioned (e.g. Rosenfeld 2003, Anderson et al. 2006).

Gard (2009) applied the models to 14 sites on 3 rivers in the Central Valley of California: 1 on the Merced River, 5 on the American River, and 8 on the upper Sacramento River. Sites ranged from 0.3 to 10.4 channel widths in length. Both models divide the study areas into ‘cells’ and predict the depth, velocity and substrate size for each, but the size and shape of the cells vary between the models and among the sites, and the 1D model could not be used in parts of the sites with transverse flows or other complex flow patterns. HSC for spawning steelhead and three runs of Chinook salmon (fall, late fall and winter) were developed for depth, velocity, and substrate size, mainly from redds in the study rivers, and the locations of Chinook redds were determined with surveying equipment or GPS readings. The HSC take values between 0 and 1, depending on the value of the relevant habitat variable.
The same HSC were used in both models and were combined by simple multiplication to produce a ‘composite suitability index’ (CSI), which also takes values from 0 to 1, for each modelled cell in the study sites at each modelled discharge. WUA is the sum of the areas of the cells multiplied by their respective CSIs, normalized by the channel length. Running each model for a range of discharges at each site produces curves of WUA over discharge.

Gard (2009) used Mann–Whitney U tests to determine whether, for Chinook, there were statistically significant differences between the CSI, at the appropriate discharge, of cells that did or did not include a redd site. For both species, he used Kolmogorov–Smirnov tests to determine whether there were significant differences between the sets of WUA curves produced by the two models for the sites and runs. Gard (2009) found that with both the 1D and 2D models, the CSI of used cells differed significantly from the CSI of unused cells for fall and winter Chinook, but not for late fall Chinook, for which the sample size was small. He also found only one significant difference among the 55 pairs of WUA curves generated by the models. On that basis, he concluded that PHABSIM could ‘relatively accurately predict the CSI of redd locations’, and that the study ‘found little difference between the PHABSIM and River2D in flow–habitat relationships’, but that River2D could model parts of the river with transverse flows or other hydraulic features that precluded use of the 1D model.

Testing whether the CSI of used cells is significantly greater than that of unused cells is essentially testing whether PHABSIM or related models do better than a random guess. This is a weak test, as has been pointed out previously (Williams 1997, Williams et al. 1999), and confuses statistical significance with biological or practical significance. In general, it is more useful to consider the magnitude of differences (effect size) and
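The CSI and WUA definitions given above can be illustrated with a minimal Python sketch. It is not Gard's (2009) code or the internals of PHABSIM or River2D; the `Cell` class, the function names, and all numbers are hypothetical, chosen only to make the arithmetic concrete. The sketch assumes that each cell's HSC values for depth, velocity and substrate have already been evaluated.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Cell:
    """One modelled cell at one discharge (hypothetical structure)."""
    area_m2: float        # plan-view area of the cell
    hsc_depth: float      # depth suitability, 0 to 1
    hsc_velocity: float   # velocity suitability, 0 to 1
    hsc_substrate: float  # substrate suitability, 0 to 1


def composite_suitability(cell: Cell) -> float:
    """CSI: the simple product of the three HSC values."""
    values = (cell.hsc_depth, cell.hsc_velocity, cell.hsc_substrate)
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("HSC values must lie between 0 and 1")
    return prod(values)


def weighted_usable_area(cells: list[Cell], channel_length_m: float) -> float:
    """WUA: sum of cell area times CSI, normalized by channel length."""
    if channel_length_m <= 0:
        raise ValueError("channel length must be positive")
    total = sum(c.area_m2 * composite_suitability(c) for c in cells)
    return total / channel_length_m


# Made-up cells for a single site at a single discharge.
cells = [
    Cell(area_m2=10.0, hsc_depth=1.0, hsc_velocity=0.5, hsc_substrate=0.8),
    Cell(area_m2=20.0, hsc_depth=0.5, hsc_velocity=0.5, hsc_substrate=1.0),
    Cell(area_m2=5.0, hsc_depth=0.0, hsc_velocity=1.0, hsc_substrate=1.0),
]
wua = weighted_usable_area(cells, channel_length_m=100.0)
print(f"WUA = {wua:.3f} m^2 per m of channel")
```

Because the HSC are multiplied, a single zero forces the CSI of a cell to zero: the third cell contributes nothing to WUA even though its velocity and substrate are rated fully suitable. Repeating the calculation at each modelled discharge would give one point on the WUA curve per discharge.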