SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information

Xuhan Liu, Shiping Yang, Ziding Zhang, Chen Li, Jiangning Song

doi:10.1007/s00726-016-2226-z

Abstract

Protein self-interaction, i.e. the interaction between two or more identical proteins expressed by one gene, plays an important role in the regulation of cellular functions. Because experimental identification of self-interactions has limitations, dedicated bioinformatics tools are needed to predict self-interacting proteins (SIPs) from protein sequence information. In this study, we propose an improved computational approach for SIP prediction, termed SPAR (Self-interacting Protein Analysis serveR). First, we developed an improved encoding scheme named critical residues substitution (CRS), which takes fine-grained domain-domain interaction information into account. Then, using the Random Forest algorithm, we evaluated the performance of CRS and compared it with several other encoding schemes commonly used for sequence-based protein-protein interaction prediction. In tenfold cross-validation tests on a balanced training dataset, CRS performed best, with an average accuracy of up to 72.01%. We further integrated CRS with other encoding schemes and identified the most important features using the minimum redundancy maximum relevance (mRMR) feature selection method. Our SPAR model with the selected features achieved an average accuracy of 92.09% on the independent human test set (ratio of positives to negatives about 1:11). We also evaluated SPAR on an independent yeast test set (ratio of positives to negatives about 1:8) and obtained an average accuracy of 76.96%. These results demonstrate that SPAR achieves reasonable performance in cross-species application. The SPAR server (http://systbio.cau.edu.cn/zzdlab/spar/) is freely available for academic use.
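The mRMR step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which mRMR variant or implementation SPAR uses, so the code below is only an assumption-laden illustration: it uses the common "difference" form (relevance minus mean redundancy) on discrete features, and the function names `mutual_information` and `mrmr_select`, the feature names, and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(columns, labels, k):
    """Greedy mRMR, difference form (an assumed variant, not necessarily SPAR's).

    columns: dict mapping feature name -> list of discrete values
    labels:  list of class labels, same length as each column
    Returns up to k feature names in selection order.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in columns.items()}
    selected = []
    remaining = list(columns)

    def score(name):
        if not selected:
            return relevance[name]
        # Mean mutual information with the features already chosen
        redundancy = sum(mutual_information(columns[name], columns[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): the label is determined jointly by "a" and "b";
# "a_copy" duplicates "a" and "noise" is unrelated to the label.
columns = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b":      [0, 0, 1, 1, 0, 0, 1, 1],
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]

print(mrmr_select(columns, labels, 2))
```

In this toy case the duplicated feature is as relevant as the original but is penalised for redundancy once the original has been chosen, which is the behaviour that motivates mRMR when several encoding schemes are combined into one feature set.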

Journal: Amino Acids
Publication Date: Apr 13, 2016
Citations: 26

