Protein functional properties prediction in sparsely-label PPI networks through regularized non-negative matrix factorization.

Qingyao Wu, Yunming Ye, Yueping Li, Chunshan Li, Ning Sun, Zhenyu Wang

doi:10.1186/1752-0509-9-s1-s9

Abstract

Background: Predicting functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology. Collective classification (CC), which uses both attribute features and relational information to jointly classify related proteins in a PPI network, has been shown to be a powerful computational method for this problem setting. CC usually increases accuracy when given a fully labeled PPI network with a large amount of labeled data. However, such labels can be difficult to obtain in many real-world PPI networks, which usually contain only a limited number of labeled proteins and a large number of unlabeled proteins. In this case, most of the unlabeled proteins may not be connected to labeled ones, so supervision knowledge cannot be obtained effectively from local network connections. As a consequence, learning a CC model in sparsely labeled PPI networks can lead to poor performance.

Results: We investigate a latent graph approach that builds an integrated latent graph by exploiting various latent linkages and judiciously combining them so as to link proteins with similar functions and separate proteins with different functions. We develop a regularized non-negative matrix factorization (RNMF) algorithm for CC that predicts protein functional properties using the data sources available in this problem setting: attribute features, the latent graph, and unlabeled data. In RNMF, a label matrix factorization term and a network regularization term are incorporated into the non-negative matrix factorization (NMF) objective function to seek a factorization that respects the network structure and the label information for classification.

Conclusion: Experimental results on KDD Cup tasks, predicting the localization and functions of proteins encoded by yeast genes, demonstrate the effectiveness of the proposed RNMF method for predicting protein properties. In the comparison, the new method performs better than the other CC algorithms considered, especially when labeled proteins are scarce.
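The abstract describes the RNMF objective only in words, so the following is a minimal sketch under stated assumptions rather than the authors' formulation. It assumes the objective ||X - VH||^2 + alpha ||M(Y - VB)||^2 + beta tr(V^T L V), where X holds non-negative attribute features, M masks the labeled proteins, and L = D - A is the Laplacian of a plain adjacency matrix A standing in for the latent graph. The function name `rnmf_sketch`, the weights `alpha` and `beta`, and the toy data are all hypothetical; the multiplicative updates follow the usual graph-regularized NMF pattern.

```python
import numpy as np

def rnmf_sketch(X, A, Y, labeled, k=2, alpha=1.0, beta=1.0, n_iter=200, seed=0):
    """Illustrative graph- and label-regularized NMF (assumed objective, see lead-in).

    X: (n, d) non-negative features; A: (n, n) symmetric non-negative adjacency;
    Y: (n, c) label indicators; labeled: (n,) boolean mask of labeled rows.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = Y.shape[1]
    eps = 1e-9                             # avoids division by zero
    V = rng.random((n, k)) + 0.1           # shared latent representation
    H = rng.random((k, d)) + 0.1           # feature basis
    B = rng.random((k, c)) + 0.1           # latent-to-label map
    m = labeled.astype(float)[:, None]     # column mask: 1 for labeled rows
    D = np.diag(A.sum(axis=1))             # degree matrix
    Lap = D - A                            # graph Laplacian

    def objective():
        fit = np.linalg.norm(X - V @ H) ** 2
        lab = alpha * np.linalg.norm(m * (Y - V @ B)) ** 2
        net = beta * np.trace(V.T @ Lap @ V)
        return fit + lab + net

    history = [objective()]
    for _ in range(n_iter):
        # Multiplicative updates keep every factor non-negative.
        num = X @ H.T + alpha * (m * Y) @ B.T + beta * A @ V
        den = V @ (H @ H.T) + alpha * (m * (V @ B)) @ B.T + beta * D @ V + eps
        V *= num / den
        H *= (V.T @ X) / (V.T @ V @ H + eps)
        B *= (V.T @ (m * Y)) / (V.T @ (m * V) @ B + eps)
        history.append(objective())
    return V, H, B, history

# Toy data (hypothetical): six proteins in two triangles, one label per triangle.
X = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.9, 0.0, 0.1],
              [0.9, 1.0, 0.1, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0
labeled = np.array([True, False, False, True, False, False])

V, H, B, history = rnmf_sketch(X, A, Y, labeled)
scores = V @ B                   # class scores for every protein
pred = scores.argmax(axis=1)     # predicted class index per protein
```

In this sketch the unlabeled proteins receive scores through the shared factor V, which is shaped jointly by the features, the graph term, and the few labeled rows. That is the intuition behind using a factorization when local label propagation has little to work with.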

Highlights

Predicting functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology.
Regularized non-negative matrix factorization (RNMF) performs best, followed by the semi-iterative classification algorithm (ICA). These two methods are much better than the SVM method, which uses only attribute features, and wvRN+RL, which uses only relational information.
We compare the proposed RNMF algorithm with baseline classifiers: SVM, wvRN+RL, ICA, semiICA and ICML.

Summary

Introduction

Predicting functional properties of proteins in protein-protein interaction (PPI) networks is a challenging problem with important implications for computational biology. Each protein is represented as a feature vector (e.g., textual features from MEDLINE), and the attribute features are taken as input to machine learning algorithms, such as SVM [2], neural networks [3], and random forest [4], to infer annotation rules for predicting the functional properties of unlabeled proteins [5]. Such methods do not account for how a protein's function can diversify when it interacts with other proteins.
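To make concrete what relational information adds, and why sparse labels hurt, here is a minimal sketch of the weighted-vote relational neighbor idea (wvRN, one of the baselines named above). Each unlabeled protein's class estimate is the weighted average of its neighbors' estimates, repeated until it settles. This is a simplified illustration, not the paper's implementation: it uses plain synchronous updates rather than a full relaxation-labeling schedule, and the function name, protein IDs, and class names are hypothetical.

```python
def wvrn_sketch(edges, known, classes, n_iter=50):
    """Simplified wvRN with synchronous updates (illustrative assumption).

    edges: list of (u, v, weight); known: dict node -> class label.
    Returns dict node -> dict class -> estimated probability.
    """
    # Build a symmetric weighted neighbor map.
    neighbors = {}
    for u, v, w in edges:
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    # Labeled nodes are clamped; unlabeled nodes start at a uniform prior.
    uniform = 1.0 / len(classes)
    est = {}
    for node in neighbors:
        if node in known:
            est[node] = {c: 1.0 if c == known[node] else 0.0 for c in classes}
        else:
            est[node] = {c: uniform for c in classes}

    for _ in range(n_iter):
        new_est = {}
        for node, nbrs in neighbors.items():
            if node in known:
                new_est[node] = est[node]
                continue
            total = sum(nbrs.values())
            # Weighted vote over the neighbors' current estimates.
            new_est[node] = {
                c: sum(w * est[u][c] for u, w in nbrs.items()) / total
                for c in classes
            }
        est = new_est
    return est

# Hypothetical toy network: a chain p1-p2-p3-p4 with labeled endpoints,
# plus a pair p5-p6 that has no path to any labeled protein.
edges = [("p1", "p2", 1.0), ("p2", "p3", 1.0), ("p3", "p4", 1.0),
         ("p5", "p6", 1.0)]
known = {"p1": "nucleus", "p4": "cytoplasm"}
est = wvrn_sketch(edges, known, ["nucleus", "cytoplasm"])
```

In this toy network, p2 and p3 lean toward the label of their nearer labeled neighbor, while p5 and p6 never move from the uniform prior because no labeled protein is reachable from them. That is the sparsely labeled situation the abstract describes.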

Journal: BMC Systems Biology
Publication Date: Jan 21, 2015
Citations: 46
License type: cc-by

Similar Papers

Detecting Essential Proteins Based on Network Topology, Gene Expression Data, and Gene Ontology Information.
Wei Zhang ... Yuanyuan Li
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 15
07 Oct 2016

The topological features of nonessential-nonhub proteins in the protein-protein interaction network
Dong Yun-Yuan ... Wang Zheng-Hua
01 Oct 2012

Effectively predicting protein functions by collective classification — An extended abstract
Wei Xiong ... Hui Liu
01 Oct 2012

NMFE-SSCC: Non-negative matrix factorization ensemble for semi-supervised collective classification
Qingyao Wu ... Ning Sun
Knowledge-Based Systems | VOL. 89
17 Jul 2015
