Abstract. Classifying river catchments into groups with similar biophysical characteristics is useful for understanding and predicting their hydrological behavior. The increasing availability of remote sensing and other large-scale geospatial datasets has enabled the use of advanced data-driven approaches to classify catchments using traits such as topography, geology, climate, land cover, land use, and human influence. Unsupervised clustering algorithms based on the Euclidean distance are commonly used for trait-based classification but are not suitable for high-dimensional data. In this study we present a new network-based method for multi-scale catchment classification, which can be applied to large datasets and used to determine the traits associated with different catchment groups. In this framework, two networks are analyzed in parallel: one in which the nodes are traits and one in which the nodes are catchments. In both cases, edges represent pairwise similarity, and a network cluster detection algorithm is used for the classification. The trait network is used to investigate redundancy in the trait data and to condense this information into a small number of interpretable categories. The catchment network is used to classify the catchments into clusters and to identify representative catchments for the different groups using the degree centrality metric. We apply this method to classify 9067 river catchments across the contiguous United States at both regional and continental scales using 274 non-categorical traits. At the continental scale, we identify 25 interpretable trait categories and 34 catchment clusters that each contain more than 50 catchments. We find that catchments with similar trait categories are typically located in the same region, with different spatial patterns emerging among clusters dominated by natural and anthropogenic traits.
We also find that the catchment clusters exhibit distinct hydrological behavior based on an analysis of streamflow indices. This network approach provides several advantages over traditional means of classification, including better separation of clusters, the use of alternative similarity metrics that are more suitable for high-dimensional data, and reduced redundancy in the trait information. The paired catchment–trait networks enable analysis of hydrological behavior using the dominant trait categories for each catchment cluster. The approach can be used at multiple spatial scales since the network topologies adjust automatically to reflect the trait patterns at the scale of investigation. Finally, the representative catchments identified as hub nodes in the network can be used to guide transferable observational and modeling strategies. The method is broadly applicable beyond hydrology for classifying other complex systems that are described by different types of trait datasets.
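The paired-network idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity metric or the cluster detection algorithm, so the sketch assumes cosine similarity with a fixed threshold for the edges and uses connected components as a simple stand-in for network cluster detection. The catchment names, trait names, trait values, and thresholds are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity_network(vectors, threshold):
    """Build an undirected network: nodes are the keys of `vectors`,
    and an edge joins two nodes whose similarity reaches `threshold`."""
    names = list(vectors)
    adj = {name: set() for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if cosine_similarity(vectors[u], vectors[v]) >= threshold:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def clusters_of(adj):
    """Connected components (assumed stand-in for cluster detection)."""
    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        clusters.append(component)
    return clusters

def hub_node(adj, cluster):
    """Node with the most neighbours inside its cluster (highest degree);
    ties are broken alphabetically."""
    return max(sorted(cluster), key=lambda n: len(adj[n] & cluster))

# Hypothetical, already-normalised trait values for five toy catchments.
trait_names = ["elevation", "slope", "urban_fraction"]
catchment_traits = {
    "A": [0.90, 0.80, 0.10],
    "B": [0.70, 0.95, 0.20],
    "C": [0.85, 0.85, 0.15],
    "D": [0.10, 0.20, 0.90],
    "E": [0.12, 0.18, 0.90],
}

# Trait network: each trait is a node described by its values across catchments.
trait_vectors = {
    name: [row[j] for row in catchment_traits.values()]
    for j, name in enumerate(trait_names)
}
trait_adj = similarity_network(trait_vectors, threshold=0.95)
trait_categories = clusters_of(trait_adj)

# Catchment network: each catchment is a node described by its trait values.
catchment_adj = similarity_network(catchment_traits, threshold=0.98)
catchment_clusters = clusters_of(catchment_adj)
hubs = [hub_node(catchment_adj, cluster) for cluster in catchment_clusters]

print("trait categories:", [sorted(c) for c in trait_categories])
print("catchment clusters:", [sorted(c) for c in catchment_clusters])
print("representative catchments:", hubs)
```

In this toy setting the two strongly correlated traits collapse into one category, which mirrors the redundancy reduction described above, and the most connected catchment in each cluster plays the role of the representative hub node. A dedicated community detection algorithm would replace `clusters_of` in any realistic application.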