Abstract

Essential proteins play significant roles in cell survival. In recent years, a large amount of Protein-Protein Interaction (PPI) data has been discovered in Saccharomyces cerevisiae. Because biological experiments are costly, a growing number of computational models have been adopted to predict essential proteins. However, the identification accuracy of these models still leaves considerable room for improvement. In this paper, a novel prediction model called NPRI is proposed to infer potential essential proteins based on the PageRank algorithm. In NPRI, a new heterogeneous Protein-Domain network is first constructed by integrating three networks: the weighted PPI network, the Domain-Domain network and the initial Protein-Domain network. These three networks are established in accordance with gene expression data, the original PPI network and the known Protein-Domain network, respectively. Next, based on the newly constructed heterogeneous Protein-Domain network, functional features and topological characteristics are extracted for each protein to construct a novel distribution rate network. An improved iteration method based on the PageRank algorithm is then applied to the distribution rate network to infer essential proteins. Finally, to evaluate its performance, NPRI is compared with other state-of-the-art prediction models. Simulation results show that NPRI achieves identification accuracies of 90%, 84.5% and 79% among the top 100, 200 and 300 predicted candidate essential proteins, respectively, remarkably outperforming the competing models, which indicates that NPRI is a promising framework for identifying essential proteins.
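To make the ranking step more concrete, the following is a minimal sketch of a generic PageRank-style iteration with restart on a small weighted network, followed by a top-k precision check of the kind used to report accuracy among the top-ranked candidates. It is not the NPRI method itself: the function names, the damping factor `alpha`, the toy network, and the rule of distributing scores in proportion to edge weight are all assumptions made for illustration, whereas NPRI derives its own distribution rates and initial scores from functional and topological features.

```python
def pagerank_scores(weights, initial, alpha=0.85, tol=1e-10, max_iter=1000):
    """Generic PageRank-style iteration with restart (illustrative only).

    weights: dict mapping node -> {neighbor: edge weight}
    initial: dict mapping node -> non-negative initial score
    alpha:   hypothetical damping factor (not taken from the paper)
    """
    nodes = sorted(weights)
    total = sum(initial[n] for n in nodes)
    # Normalize the initial scores into a restart distribution.
    restart = {n: initial[n] / total for n in nodes}
    # Total outgoing weight of each node, used to normalize its distribution.
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    scores = dict(restart)
    for _ in range(max_iter):
        new = {n: (1 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            if out_sum[u] == 0:
                # Isolated node: return its score to the restart distribution.
                for n in nodes:
                    new[n] += alpha * scores[u] * restart[n]
                continue
            # Assumed rule: pass score to neighbors in proportion to edge weight.
            for v, w in weights[u].items():
                new[v] += alpha * scores[u] * w / out_sum[u]
        delta = sum(abs(new[n] - scores[n]) for n in nodes)
        scores = new
        if delta < tol:
            break
    return scores


def top_k_precision(scores, essential, k):
    """Fraction of the k highest-scoring nodes that are in the essential set."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for p in ranked if p in essential) / k


# Toy symmetric network of four hypothetical proteins.
toy_weights = {
    "A": {"B": 1.0, "C": 1.0, "D": 1.0},
    "B": {"A": 1.0, "C": 0.5},
    "C": {"A": 1.0, "B": 0.5},
    "D": {"A": 1.0},
}
toy_initial = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}

scores = pagerank_scores(toy_weights, toy_initial)
for protein in sorted(scores, key=scores.get, reverse=True):
    print(protein, round(scores[protein], 4))
print("top-1 precision:", top_k_precision(scores, {"A"}, 1))
```

In this sketch the restart vector plays the role of an initial score, so nodes with stronger prior evidence keep receiving a share of the total score at every iteration. The scores always sum to one, which makes rankings comparable across runs with different damping factors.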

Highlights

  • Growing evidence indicates that proteins are involved in almost all life activities, although different proteins differ in their functions and importance

  • As illustrated in Fig. 1, NPRI consists of four major steps. Step 1: two original Protein-Protein Interaction (PPI) networks are constructed from the datasets of known PPIs downloaded from two public databases

  • Step 3: based on the original PPI network NI, critical topological features are first extracted for each protein; then, by combining subcellular localization and orthology information downloaded from public databases, an initial score is calculated for each protein and domain in NHPD


Summary

Introduction

Growing evidence indicates that proteins are involved in almost all life activities, although different proteins differ in their functions and importance. As an important group of proteins, essential proteins play a vital role in the development and survival of organisms: they provide fundamental requirements for sustaining life and have practical value in synthetic biology. The lack of these proteins results in the loss of biological function of the protein complex and even the death of the organism. Essential proteins are identified mainly through biological experiments, such as single-gene knockout, RNA interference and conditional knockout. Such experiments are time-consuming and expensive. Accurately detecting key proteins therefore remains a critical and challenging task [1]–[4].

