Abstract Predicting associations between diseases and genes is a critical step in revealing the molecular mechanisms of diseases and developing drug treatment strategies. With the explosive growth of biomedical data, using these data effectively for accurate prediction has become both a research hotspot and a challenge. To overcome the limitations of current prediction methods in handling complex biological network structures and feature extraction, this study proposes AGCNAF, a method that combines an unsupervised Graph Convolutional Network (GCN) with a multi-head attention mechanism. A metagraph-guided random walk strategy enables AGCNAF to capture local and high-order topological structures in the graph, while the GCN extracts deep features from these structures. By incorporating similarity features through the multi-head attention mechanism, AGCNAF effectively integrates global and local features, which significantly improves prediction performance. Predictions are made with a machine learning binary classification model, and five-fold cross-validation experiments show that AGCNAF has significant advantages in prediction performance over existing methods, with AUC and AUPR reaching 0.9686 and 0.9709, respectively, and AUC reaching up to 0.9812 under specific conditions.
To verify the practical application value of AGCNAF, this study also conducts case studies on Alzheimer's disease, lung cancer, and breast cancer. The results further confirm the strong performance of AGCNAF in identifying potential disease-gene associations, which opens up new possibilities for future disease-gene research.
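To make the idea of a type-guided random walk concrete, the following is a minimal sketch and not the AGCNAF implementation. It assumes a simpler metapath (a linear, cyclic sequence of node types) in place of a full metagraph, and the function name `metapath_walk`, the toy graph, and the node identifiers are all hypothetical.

```python
import random


def metapath_walk(adj, node_type, start, metapath, walk_length, rng):
    """Random walk constrained to follow a cyclic node-type pattern.

    adj:        dict mapping node -> list of neighbor nodes
    node_type:  dict mapping node -> type label (e.g. "disease" or "gene")
    metapath:   type sequence whose first and last entries match,
                e.g. ["disease", "gene", "disease"]
    """
    if len(metapath) < 2 or metapath[0] != metapath[-1]:
        raise ValueError("metapath must have >= 2 entries and be cyclic")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match metapath[0]")

    period = len(metapath) - 1  # the type pattern repeats after this many steps
    walk = [start]
    for step in range(1, walk_length):
        wanted = metapath[step % period]
        # Keep only neighbors whose type matches the next type in the pattern.
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy disease-gene graph; identifiers are illustrative only.
adj = {
    "d1": ["g1", "g2"],
    "d2": ["g2"],
    "g1": ["d1"],
    "g2": ["d1", "d2"],
}
node_type = {"d1": "disease", "d2": "disease", "g1": "gene", "g2": "gene"}

walk = metapath_walk(
    adj, node_type, "d1", ["disease", "gene", "disease"], 5, random.Random(0)
)
types = [node_type[n] for n in walk]
```

In this sketch the walk alternates between disease and gene nodes, so nodes that co-occur in many walks share neighborhood structure. How AGCNAF turns its metagraph-guided walks into GCN inputs is not described in the abstract and is left open here.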