Abstract
Background: Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small.

Results: In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins.

Conclusions: This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1166-7) contains supplementary material, which is available to authorized users.
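To make the idea of weighting PPIs by gene expression concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the co-expression weight of an interaction is the Pearson correlation of the two proteins' expression profiles, that negative correlations are clipped to zero, and that a weighted degree centrality is the sum of the weights on a protein's interactions. The abstract does not state these details. The function names `pearson` and `weighted_degree`, the protein labels, and the expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat profile counts as uncorrelated
    return cov / sqrt(var_x * var_y)


def weighted_degree(edges, expression):
    """Sum the co-expression weights over each protein's interactions."""
    scores = {}
    for u, v in edges:
        # Assumption: negative correlations contribute nothing.
        w = max(0.0, pearson(expression[u], expression[v]))
        scores[u] = scores.get(u, 0.0) + w
        scores[v] = scores.get(v, 0.0) + w
    return scores


# Made-up toy data: four proteins, four expression time points each.
expression = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D")]

scores = weighted_degree(edges, expression)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network, protein C has one interaction but a weighted score of zero, because its profile is anti-correlated with that of its only partner. Under the stated assumptions, this illustrates how the same topology can yield different rankings once interactions are weighted differently.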
Highlights
Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality
We found that the interactions between essential proteins (IBEPs) whose co-expression weights are larger than 0.75 account for about 3.9 % and 3 % of yeast protein-protein interactions (PPIs) based on BioGRID and DIP, respectively
Most of them are based on the topological properties of protein-protein interaction networks (PINs) and have only limited prediction accuracy
Summary
Genome-wide gene deletion studies show that a small fraction of genes in a genome are indispensable to the survival or reproduction of an organism [1, 2]. These genes are referred to as essential genes, and essential proteins are the products of essential genes. Studies have shown that essential genes contribute to a diverse spectrum of diseases [3, 4]. Identifying them is therefore important for understanding the minimal requirements for the survival of an organism, and for finding human disease genes [3, 4] and new drug targets [5, 6].