Abstract

<p indent="0mm">The discovery of pathogenic genes is a major challenge in genome research. Recently, many researchers have applied computational methods to the available biological data to predict pathogenic genes. However, most of these methods are based on gene interaction networks or similar networks, and the potential connection between the local network of a specific gene and its differential expression information is rarely considered. This paper explores the biological properties of cancerous pathogenic genes and their neighbors based on local network structure and the differential expression information of genes. Machine learning methods are then employed to predict cancerous pathogenic genes from the newly discovered properties. First, the expression data of 21 cancers and their pathogenic genes are obtained from the TCGA (The Cancer Genome Atlas) and OMIM (Online Mendelian Inheritance in Man) databases. Human protein interaction and tissue-specific interaction networks corresponding to each cancer gene are used as background networks to analyze the potential biological properties linking the neighborhood information of the different networks to changes in gene expression. Then, based on the discovered properties, a vector representation of gene node features is established, and a support vector machine is employed for pathogenic gene prediction. The experimental results are verified against ICGC (International Cancer Genome Consortium), COSMIC (Catalogue Of Somatic Mutations In Cancer), NCG (Network of Cancer Genes), OncoKB (Oncology Knowledge Base), and other standard databases, as well as against related literature, disease annotations, and pathway enrichments. The results show that the defined gene features can be used to distinguish cancerous pathogenic genes from other genes, thus providing strong hypotheses for cancerous pathogenic gene prediction. Furthermore, reliable pathogenic gene candidates are obtained for related biological experiments, promoting research on the pathogenic mechanism of cancer, a complex disease.
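As a rough illustration of what a vector representation of gene node features might look like, the sketch below combines a gene's own differential expression with simple statistics over its network neighbors. The abstract does not specify the actual features, so the function `node_features`, the choice of four features, and the toy network and fold-change values are all assumptions made for illustration, not the method used in the paper.

```python
from statistics import mean


def node_features(gene, network, log_fc):
    """Build a hypothetical feature vector for one gene.

    network: dict mapping a gene to the set of its neighbors in a
             background interaction network.
    log_fc:  dict mapping a gene to its log fold change between
             tumor and normal samples (differential expression).
    """
    neighbors = network.get(gene, set())
    # Magnitude of expression change for each neighbor with known data.
    neighbor_fc = [abs(log_fc[n]) for n in neighbors if n in log_fc]
    return [
        abs(log_fc.get(gene, 0.0)),                 # the gene's own change
        float(len(neighbors)),                      # degree in the local network
        mean(neighbor_fc) if neighbor_fc else 0.0,  # mean neighbor change
        max(neighbor_fc, default=0.0),              # largest neighbor change
    ]


# Toy inputs (made-up identifiers and values, for illustration only).
network = {"G1": {"G2", "G3"}, "G2": {"G1"}, "G3": {"G1"}}
log_fc = {"G1": -2.0, "G2": 1.0, "G3": 0.5}

features = node_features("G1", network, log_fc)
print(features)
```

Vectors of this kind, one per gene, could then be passed with known pathogenic labels to a support vector machine classifier, which is the prediction step the abstract describes.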
