A novel candidate disease gene prioritization method using deep graph convolutional networks and semi-supervised learning

Saeid Azadifar, Ali Ahmadi

doi:10.1186/s12859-022-04954-x

Journal: BMC Bioinformatics
Publication Date: Oct 14, 2022
Citations: 6
License type: open-access

Affiliation: K.N.Toosi University of Technology

Abstract

Background

Candidate disease genes must be selected and prioritized before laboratory studies begin, because identifying disease genes from a large pool of candidates by laboratory methods is costly and time-consuming. Many machine learning-based gene prioritization methods exist. They differ in the feature vectors used for genes, the structure of the datasets, and the learning model. Key challenges for these methods are constructing a suitable feature vector for each gene, choosing a learning model that works on data with varied, non-Euclidean structures such as graphs, and the lack of negative data. Graph neural networks have recently emerged in machine learning and related fields and have demonstrated superior performance on a broad range of problems.

Methods

This study presents a new semi-supervised learning method based on graph convolutional networks, together with a novel way of constructing feature vectors for each gene. First, we construct three feature vectors for each gene using terms from the Gene Ontology (GO) database. Then, we train a graph convolutional network on these vectors using protein–protein interaction (PPI) network data to identify candidate disease genes. The model learns hidden-layer representations that encode both the local graph structure and the node features. The method simultaneously considers the topological information of the biological network (e.g., PPI) and other sources of evidence. Finally, validation is performed to demonstrate the efficiency of the method.

Results

Several experiments on 16 diseases evaluate the performance of the proposed method. The experiments demonstrate that it achieves the best results in terms of precision, area under the ROC curve (AUC), and F1-score when compared with eight state-of-the-art network-based and machine learning-based disease gene prioritization methods.

Conclusion

This study shows that the proposed semi-supervised learning method appropriately classifies and ranks candidate disease genes using a graph convolutional network and an innovative method for creating three feature vectors per gene from the molecular function, cellular component, and biological process terms in GO.
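
The general idea described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the network depth, the normalization, or the loss, so the code below assumes the widely used two-layer graph convolution with a symmetrically normalized adjacency matrix and a binary cross-entropy loss restricted to labeled genes. The toy PPI graph, the single random binary feature matrix (standing in for the three GO-based vectors), the weights, and all function names are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees are >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(a_norm, x, w1, w2):
    """Two graph-convolution layers; returns one score in (0, 1) per gene."""
    hidden = np.maximum(a_norm @ x @ w1, 0.0)   # ReLU activation
    logits = a_norm @ hidden @ w2
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))  # sigmoid


def masked_bce(scores, labels, mask):
    """Binary cross-entropy computed only on labeled genes (semi-supervised)."""
    eps = 1e-9
    s = np.clip(scores[mask], eps, 1.0 - eps)
    y = labels[mask]
    return float(-np.mean(y * np.log(s) + (1.0 - y) * np.log(1.0 - s)))


rng = np.random.default_rng(0)

# Hypothetical PPI graph over 5 genes (undirected, unweighted).
adj = np.array(
    [
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0],
    ],
    dtype=float,
)

# Hypothetical binary features: 1 if a gene is annotated with a GO term.
features = rng.integers(0, 2, size=(5, 6)).astype(float)

# Only genes 0 and 4 are labeled; the other entries are ignored via the mask.
labels = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
mask = np.array([True, False, False, False, True])

# Untrained random weights, used only to show the forward pass.
w1 = rng.normal(scale=0.1, size=(6, 4))
w2 = rng.normal(scale=0.1, size=(4, 1))

a_norm = normalize_adjacency(adj)
scores = gcn_forward(a_norm, features, w1, w2)
loss = masked_bce(scores, labels, mask)
ranking = np.argsort(-scores)   # gene indices, highest score first
```

In this sketch each layer mixes a gene's features with those of its PPI neighbors, which is how network topology and node features enter the score together. Training would update `w1` and `w2` by minimizing `loss`, and candidate genes would then be ranked by `scores`.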
