Once a critical node is destroyed, an interdependent network is prone to severe cascading failure. Because of the coupling, traditional methods are difficult to apply to interdependent networks. Here, we propose a novel comprehensive model based on machine learning. The core of this data-driven approach is to train the model on a small set of nodes (5% of the graph) and identify critical nodes among the rest. We collect node centrality indicators to describe the node features and provide informative input data from different dimensions. We replace uniform node sampling with cluster oversampling, which combines K-means and the Synthetic Minority Over-sampling Technique (SMOTE) to select and synthesize uniformly distributed training samples. We optimize XGBoost with a genetic algorithm (GA) to overcome the instability of manually set parameters. Kendall's τ correlation coefficient, the Jaccard similarity coefficient, R², and RMSE are used as the model performance evaluation metrics. Experimental results confirm that the proposed GA-XGBoost model outperforms the other models, demonstrating higher adaptability and stability in various situations. The heuristic algorithm-optimized machine learning model offers a viable solution for identifying critical nodes in interdependent networks, which is of great significance for controlling virus propagation and preventing failures.
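The four evaluation metrics named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy scores are hypothetical, Kendall's τ is computed in its tau-a form (no tie correction), and applying the Jaccard coefficient to the top-k node sets of the true and predicted rankings is one common reading that the abstract does not confirm.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Assumption: tied pairs count as neither concordant nor discordant.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            score += 1
        elif prod < 0:
            score -= 1
    return score / (n * (n - 1) / 2)


def top_k_jaccard(true_scores, pred_scores, k):
    """Jaccard similarity between the top-k node sets of two {node: score} dicts."""
    top_true = set(sorted(true_scores, key=true_scores.get, reverse=True)[:k])
    top_pred = set(sorted(pred_scores, key=pred_scores.get, reverse=True)[:k])
    return len(top_true & top_pred) / len(top_true | top_pred)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical criticality scores for four nodes (illustration only).
true_scores = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.1}
pred_scores = {"a": 0.8, "b": 0.5, "c": 0.6, "d": 0.2}

nodes = sorted(true_scores)
y_true = [true_scores[v] for v in nodes]
y_pred = [pred_scores[v] for v in nodes]

print("Kendall tau:", kendall_tau(y_true, y_pred))
print("Top-2 Jaccard:", top_k_jaccard(true_scores, pred_scores, k=2))
print("R2:", r2(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Kendall's τ and the Jaccard coefficient compare rankings, so they measure whether the model orders nodes correctly, while R² and RMSE measure how closely the predicted scores match the true values.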