Abstract

Gene expression data can be used to identify genes whose expression has changed, to study correlations between genes, and to assess how different conditions influence gene activity. However, labeling large amounts of gene expression data is laborious and time-consuming, and the resulting shortage of labeled data makes it challenging to build deep learning models. Currently, some semi-supervised graph neural networks (GNNs) focus only on the feature space and sample space of gene expression data, which may limit their accuracy. This paper proposes a novel semi-supervised graph neural network model (SFWN). First, we use external knowledge about gene expression data to construct, for the first time, a feature graph, a similarity kernel, and a sample graph. Next, a novel semi-supervised learning algorithm (SGA) is proposed to extract relationships in the data and better capture the global sample structure. A graph sparse module (SGCN) is also proposed to handle sparse representations in gene expression data classification. To overcome the over-smoothing problem, a new feature calculation method based on the two spaces is proposed for analyzing and computing feature representations in the model. Extensive experiments and ablation studies on several public datasets show that SFWN outperforms state-of-the-art approaches (accuracy and F1-score of 0.9993 and 0.9899, respectively). The experimental results show that the proposed SFWN model has strong gene expression feature learning and representation ability, and it may provide new insight and a new tool for the diagnosis of related diseases and for clinical practice.
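
The abstract does not specify how the sample graph and feature graph are built, so the following is only a minimal sketch of the general idea, not the SFWN method itself. It assumes a Gaussian (RBF) kernel as the similarity measure, k-nearest-neighbour sparsification, and the standard symmetric normalization used in graph convolutional networks; the function names `knn_similarity_graph` and `normalized_adjacency` are hypothetical, the data are random toy values, and no external knowledge is used.

```python
import numpy as np


def knn_similarity_graph(X, k=2, gamma=None):
    """Build a symmetric k-nearest-neighbour graph over the rows of X.

    Edge weights come from a Gaussian (RBF) kernel. This kernel is an
    illustrative assumption, not the similarity kernel used by SFWN.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if gamma is None:
        # Heuristic bandwidth: inverse of the mean off-diagonal distance.
        off_diag = d2[~np.eye(n, dtype=bool)]
        gamma = 1.0 / max(off_diag.mean(), 1e-12)
    S = np.exp(-gamma * d2)
    np.fill_diagonal(S, 0.0)  # no self-loops at this stage
    # Keep only the k strongest neighbours of every node.
    A = np.zeros_like(S)
    idx = np.argsort(-S, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    A[rows, idx.ravel()] = S[rows, idx.ravel()]
    # Symmetrise so that the graph is undirected.
    return np.maximum(A, A.T)


def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


# Toy expression matrix: 6 samples (rows) by 4 genes (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

sample_graph = knn_similarity_graph(X, k=2)      # nodes are samples
feature_graph = knn_similarity_graph(X.T, k=2)   # nodes are genes
P = normalized_adjacency(sample_graph)
H = P @ X  # one propagation step over the sample graph
```

Applying the same routine to `X` and to its transpose gives one graph over samples and one over genes, which is one simple way to work in both spaces. Repeating the propagation step `P @ X` many times is what tends to cause the over-smoothing the abstract mentions.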
