Abstract

Disease gene prediction remains one of the main computational challenges of precision medicine. It is still uncertain whether disease genes have unique functional properties that distinguish them from non-disease genes or, from a network perspective, whether they are located randomly in the interactome or show specific patterns in the network topology. In this study, we propose a new method for disease gene prediction based on biological knowledge bases (gene-disease associations, gene functional annotations, etc.) and interactome network topology. The proposed algorithm, called MOSES, is based on the definition of two somewhat opposing sets of genes, both disease-specific from different perspectives: warm seeds (i.e., disease genes obtained from databases) and cold seeds (genes far from the disease genes on the interactome and not involved in their biological functions). The application of MOSES to a set of 40 diseases showed that the suggested putative disease genes are significantly enriched in their reference disease. Reassuringly, known and predicted disease genes together tend to form a connected network module on the human interactome, mitigating the scattered distribution of disease genes, which is probably due to both the paucity of disease-gene associations and the incompleteness of the interactome.
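
The claim that known and predicted disease genes together form a connected module can be checked, in principle, by measuring the largest connected component (LCC) of the subgraph they induce on the interactome. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `largest_component_size`, the adjacency-dictionary representation, and the five-gene toy interactome are all hypothetical.

```python
def largest_component_size(adj, genes):
    """Size of the largest connected component of the subgraph induced by `genes`.

    `adj` maps each gene to the set of its interactors (undirected graph).
    Genes absent from `adj` are ignored.
    """
    genes = set(genes) & set(adj)
    seen = set()
    best = 0
    for start in genes:
        if start in seen:
            continue
        # Depth-first traversal restricted to the gene set of interest.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adj[node]:
                if neighbor in genes and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical toy interactome: a simple path A - B - C - D - E.
toy_adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
known = {"A", "C", "E"}   # scattered: no two known genes interact directly
predicted = {"B", "D"}    # hypothetical predictions that bridge the gaps

lcc_known = largest_component_size(toy_adj, known)
lcc_all = largest_component_size(toy_adj, known | predicted)
```

In this toy case the known genes alone are isolated from one another, while adding the bridging predictions yields a single connected module. On a real interactome, such a comparison would normally be assessed against a suitable random expectation.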

Highlights

  • Precision medicine has been defined as “an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person.” [1]

  • Because MOSES was designed to exploit data integration in the prediction of new disease genes, we considered only the subset of 27 diseases. For each of them, Table 1 shows the number of warm seeds (WSs), the number of genes identified by MOSES after applying the first constraint (network-based distance), and the number of cold seeds (CSs).

  • It is worth noting that applying the functional distance constraint further filters the set of peripheral genes, showing that integrating protein-protein interactome topology with gene functional annotation databases makes it possible to appropriately identify the two opposing sets of genes.
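
The two-step filtering mentioned in the highlights (a network-based distance constraint followed by a functional distance constraint) can be sketched as follows. This is an illustration only, not the MOSES implementation: using shortest-path hop count as the network distance, the threshold `min_dist`, treating "no shared annotation term with any warm seed" as the functional constraint, and all function names and toy data are assumptions.

```python
from collections import deque


def bfs_distances(adj, sources):
    """Multi-source BFS: hop count from the nearest source to each reachable node."""
    dist = {s: 0 for s in sources if s in adj}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def select_cold_seeds(adj, annotations, warm_seeds, min_dist=3):
    """Return (peripheral genes, cold seeds) for a set of warm seeds.

    Step 1 (assumed network constraint): keep genes at least `min_dist`
    hops from every warm seed; unreachable genes count as infinitely far.
    Step 2 (assumed functional constraint): of those, keep genes sharing
    no annotation term with any warm seed. Genes with no annotations pass
    this step trivially.
    """
    warm_seeds = set(warm_seeds)
    dist = bfs_distances(adj, warm_seeds)
    peripheral = {
        g for g in adj
        if g not in warm_seeds and dist.get(g, float("inf")) >= min_dist
    }
    warm_terms = set()
    for g in warm_seeds:
        warm_terms |= annotations.get(g, set())
    cold = {g for g in peripheral if not (annotations.get(g, set()) & warm_terms)}
    return peripheral, cold


# Hypothetical toy interactome: path A - B - C - D - E, plus F attached to D.
toy_adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E", "F"},
    "E": {"D"},
    "F": {"D"},
}
# Hypothetical annotation terms per gene.
toy_annotations = {
    "A": {"t1"},
    "D": {"t1", "t2"},  # far from A on the network, but functionally related
    "E": {"t3"},
    "F": {"t4"},
}

peripheral, cold = select_cold_seeds(toy_adj, toy_annotations, {"A"}, min_dist=3)
```

In this toy example, gene D passes the network constraint but is removed by the functional one because it shares term `t1` with the warm seed, which mirrors the further filtering described above.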


Summary

Introduction

Precision medicine has been defined as “an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person.” [1] This definition is mainly related to the experimental, methodological, and technological developments of the last decades (e.g., next-generation sequencing), which opened new possibilities for healthcare based on individually tailored therapies. The identification of specific disease genes is often impaired by gene pleiotropy, by the polygenic nature of many diseases, by the influence of a plethora of environmental factors, and by genome variability [6]. Various experimental techniques, such as genome-wide association studies (GWAS) and linkage analysis, are used to identify new seed genes. The disadvantage of these high-throughput techniques is that they often provide long lists of candidate genes and require validation procedures, which makes them time-consuming and expensive.

