Existing research on automatic river network classification methods has difficulty scientifically quantifying and determining feature threshold settings and evaluating weights when calculating multi-indicator features of the local and overall structures of river reaches. In order to further improve the accuracy of river network classification and evaluate the feature weight, this paper proposes an automatic grading method for river networks based on ensemble learning in CatBoost. First, the graded river network based on expert knowledge is taken as the case; with the support of the existing case results, a total of eight features from the semantic, geometric, and topological aspects of the river network were selected for calculation. Second, the classification model, obtained through learning and training, was used to calculate the classification results of the main stream and tributaries of the river reach to be classified. Furthermore, the main stream river reaches were connected, and the main stream rivers at different levels were hierarchized to achieve river network classification. Finally, the Shapley Additive explanation (SHAP) framework for interpreting machine learning models was introduced to test the influence of feature terms on the classification results from the global and local aspects, so as to improve the interpretability and transparency of the model. Performance evaluation can determine the advantages and disadvantages of the classifier, improve the classification effect and practicability of the classifier, and improve the accuracy and reliability of river network classification. The experiment demonstrates that the proposed method achieves expert-level imitation and has higher accuracy for identifying the main stream and tributaries of river networks. Compared with other classification algorithms, the accuracy was improved by 0.85–5.94%, the precision was improved by 1.82–9.84%, and the F1_Score was improved by 0.8–5.74%. In this paper, CatBoost is used for river network classification for the first time, and SHAP is used to explain the influence of characteristics, which improves the accuracy of river network classification and enhances the interpretability of the classification method. By constructing a reasonable hierarchy, a better grading effect can be achieved, and the intelligence level of automatic grading of river networks can be further improved.
Read full abstract