Abstract

Palmitoylated proteins regulate several biological processes in cells, including protein localization, protein trafficking, and protein-protein interactions. Identifying palmitoylated proteins may therefore contribute to our understanding of important molecular mechanisms and biological functions. In this paper, a simple but effective computational method named MRMD-Palm is proposed for identifying palmitoylated proteins. First, palmitoylated and non-palmitoylated proteins were collected to construct a benchmark dataset. Next, features were extracted using several typical sequence-based encoding schemes based on amino acid composition, pseudo amino acid composition, and physicochemical properties. To reduce noise and information redundancy, a feature selection technique named MRMD (Max-Relevance-Max-Distance) was used to obtain the optimal feature subset. Finally, the random forest algorithm was used as the predictor. Under 10-fold cross-validation, the method achieved a sensitivity of 97.1%, a specificity of 95.9%, and an MCC (Matthews correlation coefficient) of 73.7%, outperforming most existing methods. The experimental results also showed that using hydrophobicity to encode palmitoylated sequences can improve the performance of the predictor, a finding consistent with the hydrophobic nature of palmitoylation. The results of this study are expected to help elucidate the mechanism of protein palmitoylation.
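As an illustration only, the sequence encoding and the reported evaluation measures can be sketched in a few lines of Python. The abstract does not specify how hydrophobicity was encoded, so the Kyte-Doolittle hydropathy scale used here is an assumption, and the function names (`amino_acid_composition`, `mean_hydropathy`, `encode`, `mcc`) are hypothetical. The sketch covers neither the MRMD selection step nor the random forest predictor.

```python
import math

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale. ASSUMPTION: the paper may use a
# different hydrophobicity index; this scale is only a stand-in.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    seq = seq.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def mean_hydropathy(seq):
    """Return the average hydropathy over the standard residues in seq."""
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


def encode(seq):
    """Hypothetical 21-dimensional encoding: composition plus mean hydropathy."""
    return amino_acid_composition(seq) + [mean_hydropathy(seq)]


def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a full pipeline, vectors such as those returned by `encode` would be passed through feature selection and then to a classifier, with the three measures computed from the confusion counts accumulated over the cross-validation folds.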
