Abstract

One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. To date, a variety of methods have been explored, and algorithms for predicting whether a protein is involved in disease are becoming increasingly useful. Here, we describe an algorithm for detecting protein-disease associations based on the human protein-protein interaction network, known gene-disease associations, protein sequence, and protein functional information at the molecular level. Our method, PhenoPred (www.phenopred.org), is supervised: first, we map each protein onto the spaces of disease and functional terms based on its distance to all annotated proteins in the protein interaction network. We also encode sequence, function, physicochemical, and predicted structural properties, such as secondary structure and flexibility. We then train support vector machines to detect a protein's disease associations for a number of terms in Disease Ontology (DO). We provide evidence that, despite the noise and incompleteness of experimental data and the unfinished ontology of diseases, identification of candidate genes and proteins can be successful even when predictions are made for a large number of candidate disease terms simultaneously.

Highlights

  • One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease

  • We propose a method to associate genes or proteins with various levels of disease classification using Disease Ontology (DO), which organizes disease terms into a top-down hierarchy that expands from the root "disease" term to the most specific disease names

  • We constructed three sets of features for predicting disease associations: (i) PPI-DO features were constructed based on the distribution of shortest distances from a protein p to other proteins in the PPI network known to be associated with specific disease terms; (ii) PPI-GO features were constructed in a similar way, but based on the shortest distances to other proteins known to be associated with specific GO terms; (iii) SPP-GO features encode various sequence, physicochemical, and other predicted properties of the protein as well as its GO terms
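
The distance-based features in (i) and (ii) can be illustrated with a minimal sketch. It is not the PhenoPred implementation: the function names, the toy network, the `max_dist` cutoff, the overflow bin for distant or unreachable proteins, and the choice to exclude the query protein from its own feature vector are all assumptions made for illustration. The sketch runs a breadth-first search from a protein p and summarizes, as a normalized histogram, how far p is from the proteins annotated with one term.

```python
from collections import Counter, deque


def shortest_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def distance_features(graph, protein, annotated, max_dist=3):
    """Normalized histogram of shortest distances from `protein` to the
    proteins in `annotated` (those known to carry one DO or GO term).

    Returns max_dist + 1 values: the fraction of annotated proteins at
    distance 1, 2, ..., max_dist, followed by an overflow bin for proteins
    that are farther away or unreachable. The query protein itself is
    skipped (an assumption, to avoid leaking its own label).
    """
    dist = shortest_distances(graph, protein)
    others = [q for q in annotated if q != protein]
    if not others:
        return [0.0] * (max_dist + 1)
    counts = Counter()
    for q in others:
        d = dist.get(q)
        bucket = d if d is not None and d <= max_dist else max_dist + 1
        counts[bucket] += 1
    return [counts[k] / len(others) for k in range(1, max_dist + 2)]


# Hypothetical toy PPI network (undirected, stored as adjacency sets).
toy_ppi = {
    "A": {"B", "E"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": {"A"},
    "F": set(),  # isolated protein, unreachable from the others
}
# Hypothetical set of proteins annotated with one disease term.
toy_term_proteins = {"A", "B", "D", "F"}

features = distance_features(toy_ppi, "A", toy_term_proteins, max_dist=2)
```

Repeating this for every term and concatenating the histograms would give one fixed-length vector per protein, which is the kind of input a support vector machine expects.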


Summary

Introduction

One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. We present our novel approach to the prediction of protein-disease associations based on an experimental PPI network, known protein-disease associations, as well as protein sequence and functional annotation.
