Abstract

Extracting driver genes from the set of mutated genes, and separating driver genes from passenger genes, are among the most controversial problems in cancer studies. Because cancer is heterogeneous, it is not possible to identify indicators for a group of associated drivers and thereby identify the group of patients whose disease is related to those subgroups. Precise identification of the related driver genes using artificial intelligence techniques therefore remains a challenge for researchers. In this research, a new method is developed based on unsupervised subspace learning with additional constraints, with the aim of extracting driver genes more precisely. The results show that the proposed method is better able to predict driver genes and driver-gene subgroups, which have the highest overlap, as measured by p-value, with known driver genes in validated databases. The MSigDB driver genes serve as the benchmark, and the driver genes selected by the proposed method overlap more with them. In addition to recovering the driver genes defined in previous work, this article introduces new driver genes and defines new groups of driver genes compared with other methods. For 200 genes, the p-value of the proposed method was 9.21e-7, better than that of previous methods, in terms of overlap and of the new driver genes, driver-gene groups, and subgroups. The results also show that the overlap p-value of the proposed method is about 2.7 times lower than that of the driver sub method, indicating that the proposed method can identify driver genes in cancerous tumors with greater accuracy and reliability.

Highlights

  • Extracting driver genes from the set of mutated genes, and separating driver genes from passenger genes, are among the most controversial problems in cancer studies

  • All mutated genes in a tumor can be identified, but many of them have no effect on tumor development; these are known as passenger genes

  • Previous methods have identified a list of driver genes, but because of mutation heterogeneity, some of them have not been identified properly

Summary

Our parameters include the weight matrix W, which relates a subset of samples to the subspace dimensions. To calculate the actual values of the matrices W and Z, we used a basic matrix factorization method; each time we repeated the procedure from the initial W and Z, we obtained more accurate values. The new genes identified by this method are, in terms of p-value, very similar to those of previous works, and they overlap better with breast cancer benchmarks. Among the top 200 ranked driver genes, the genes proposed by our method overlapped with the benchmarks better than those of previous methods, giving p-value = 9.21e-07. Table 1 compares the p-values of the previous and proposed methods, averaged over a subset of 200 driver genes, together with the mutation scores obtained by the different methods. Black indicates the lowest p-value and highest mutation score; for example, the gene GH2 has more overlap
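
The summary reports overlap p-values but does not say which statistical test produced them. As a rough illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for gene-set overlap; it is not confirmed as the authors' procedure. The function names, the toy gene identifiers (G1 to G10), and the set sizes are all hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, benchmark_size, selected_size, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    Assumption: the selected genes are modeled as a random draw without
    replacement from the gene universe; the source does not name its test.
    """
    total = comb(universe_size, selected_size)
    upper = min(benchmark_size, selected_size)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(benchmark_size, k)
        * comb(universe_size - benchmark_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def gene_overlap_p_value(universe, benchmark, selected):
    """Compute the overlap p-value from gene lists (hypothetical helper)."""
    universe = set(universe)
    benchmark = set(benchmark) & universe
    selected = set(selected) & universe
    shared = len(benchmark & selected)
    return overlap_p_value(len(universe), len(benchmark), len(selected), shared)


# Toy example with made-up gene identifiers, not data from the paper.
universe = [f"G{i}" for i in range(1, 11)]   # 10 genes in total
benchmark = ["G1", "G2", "G3", "G4"]         # 4 known driver genes
selected = ["G1", "G2", "G9"]                # 3 predicted genes, 2 of them known
p = gene_overlap_p_value(universe, benchmark, selected)
print(f"{p:.4f}")  # 0.3333
```

Under this assumption, a smaller p-value means the predicted list shares more genes with the benchmark than chance alone would explain, which matches how the summary uses p-values to compare methods.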
