Abstract

Protein-protein interaction (PPI) network data provide valuable information that implies direct links between genes and their biological roles. This information underpins a fundamental hypothesis for protein function prediction: interacting proteins tend to have similar functions. With recently developed network embedding feature generation methods and deep maxout neural networks, it is possible to extract functional representations that encode direct links between protein-protein interaction information and protein function. Our novel method, STRING2GO, uses deep maxout neural networks to learn functional representations that simultaneously encode both protein-protein interaction and functional predictive information. The experimental results show that STRING2GO outperforms other protein-protein interaction network-based prediction methods and one benchmark method adopted in a recent large-scale protein function prediction competition.
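To make the term "maxout" concrete, the following is a minimal sketch of a single maxout layer applied to a toy embedding vector. It is only an illustration under stated assumptions: the function name `maxout_layer`, the layer sizes, and all weight values are hypothetical and are not taken from STRING2GO, whose actual architecture is not described in this section.

```python
def maxout_layer(x, weights, biases):
    """Maxout activation: each output unit is the max over its affine pieces.

    x:       list of d input features (e.g. a network embedding vector)
    weights: weights[i][j] is a list of d weights for piece j of unit i
    biases:  biases[i][j] is the bias for piece j of unit i
    """
    outputs = []
    for unit_w, unit_b in zip(weights, biases):
        # Evaluate every affine piece w.x + b belonging to this unit.
        pieces = [
            sum(w * xi for w, xi in zip(piece_w, x)) + b
            for piece_w, b in zip(unit_w, unit_b)
        ]
        # The unit's activation is the largest piece.
        outputs.append(max(pieces))
    return outputs


# Hypothetical toy values: a 3-dimensional embedding, 2 units, 2 pieces each.
embedding = [1.0, -2.0, 0.5]
weights = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],   # unit 0
    [[0.0, 0.0, 2.0], [-1.0, 0.0, 0.0]],  # unit 1
]
biases = [[0.0, 0.0], [0.0, 2.5]]

hidden = maxout_layer(embedding, weights, biases)
print(hidden)  # [1.0, 1.5]
```

Because each unit takes the maximum over several learned linear pieces, a maxout unit behaves as a learned piecewise-linear activation rather than a fixed one such as ReLU. In a deep network, several such layers would be stacked and trained on function labels.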

Highlights

  • The realisation of the complex relationships between genotypes and phenotypes has fostered the collection and analysis of genome-wide datasets of molecular interactions detected from patterns of physical binding, transcript co-expression, mutant phenotypes, etc.

  • We evaluate the predictive performance of the STRING2GO-learnt functional representations (i.e. STRING2GO_Mashup and STRING2GO_Node2vec) by comparing them with their corresponding raw network embedding representations

  • We compare the performance of the Mashup and Node2vec methods both when they are used to generate the raw network embedding representations and when they serve as component methods of STRING2GO to learn the functional representations



Introduction

The realisation of the complex relationships between genotypes and phenotypes has fostered the collection and analysis of genome-wide datasets of molecular interactions detected from patterns of physical binding, transcript co-expression, mutant phenotypes, etc. STRING [5] considers experimentally detected protein-protein interactions (PPIs), conserved mRNA co-expression, co-mention in abstracts and papers, interactions from curated databases, conserved gene proximity, gene co-occurrence/co-absence and gene fusion events. Interactions in such databases are typically assigned confidence scores, which can be used for integration purposes [2, 6, 7]. These data provide valuable direct links between genes and their biological roles, and form the basis for protein function prediction methods that do not rely on traditional
