Abstract

Genes with similar functions tend to produce similar phenotypes and to cluster together in the protein-protein interaction (PPI) network. The genes responsible for coronary artery disease (CAD) should therefore be identifiable from their combined network topological features. Here we introduce CTFMining, a method that predicts candidate disease genes by mining combined network topological features. Four topological features were defined to describe the network characteristics of genes. We then used each feature on its own, as well as combinations of features, to screen for disease genes by training support vector machines (SVMs). Combined features predicted disease genes better than any single feature, and an optimal feature combination was found for distinguishing disease genes from non-disease genes. Using this optimal combination, candidate disease genes were predicted with SVMs, and the intersection of 10,000 predictions was taken as the final prediction, which yielded 224 candidate disease genes. Nearly 86% of these candidates were found to be associated with CAD, as verified by Prioritizer or PandS. The candidate genes were also likely to share functions with known CAD disease genes. The optimal combined feature can therefore distinguish disease genes from non-disease genes well. As interaction data grow and more disease genes are discovered, the method should predict novel candidate genes more accurately.
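
The final step, taking the intersection of many repeated predictions, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the CTFMining implementation: the toy network, the gene names, and the two features (degree and fraction of neighbors that are known disease genes) are hypothetical and are not the paper's four features. The abstract does not say why the 10,000 predictions differ, so the sketch assumes each run draws a different random set of presumed non-disease genes. A nearest-centroid rule stands in for the SVM to keep the example dependency-free.

```python
import random

# Toy undirected PPI network as an adjacency map (hypothetical gene names).
PPI = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D", "F"},
    "F": {"E", "G"},
    "G": {"F", "H"},
    "H": {"G"},
}
KNOWN_DISEASE = {"A", "B"}  # hypothetical known disease genes


def features(gene):
    """Combine two illustrative topological features into one vector."""
    neighbors = PPI[gene]
    degree = len(neighbors)
    disease_fraction = len(neighbors & KNOWN_DISEASE) / degree
    return (degree, disease_fraction)


def centroid(genes):
    """Mean feature vector of a group of genes."""
    vectors = [features(g) for g in genes]
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def predict_once(rng):
    """One prediction round with a freshly sampled negative set (assumption).

    A nearest-centroid rule stands in for the SVM used in the paper.
    """
    unlabeled = sorted(set(PPI) - KNOWN_DISEASE)
    negatives = rng.sample(unlabeled, len(KNOWN_DISEASE))
    pos_c, neg_c = centroid(KNOWN_DISEASE), centroid(negatives)
    return {
        g for g in unlabeled
        if sq_dist(features(g), pos_c) < sq_dist(features(g), neg_c)
    }


def intersect_predictions(n_runs, seed=0):
    """Keep only the genes predicted as disease genes in every round."""
    rng = random.Random(seed)
    result = predict_once(rng)
    for _ in range(n_runs - 1):
        result &= predict_once(rng)
    return result


candidates = intersect_predictions(n_runs=20)
print(sorted(candidates))
```

Intersecting the rounds is a deliberately conservative choice: a gene survives only if every round predicts it, so the result can be small or even empty on a toy network like this one. With real data, a trained SVM (for example scikit-learn's `SVC`) would take the place of the nearest-centroid rule.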
