PLoc_Deep-mHum: Predict Subcellular Localization of Human Proteins by Deep Learning

Yu-Tao Shao, Kuo-Chen Chou, Zhe Lu, Xin-Xin Liu

doi:10.4236/ns.2020.127042

Open Access

Journal: Natural Science
Publication Date: Jan 1, 2020
Citations: 6
License type: CC BY 4.0

Affiliation: Jingdezhen Ceramic Institute

Abstract

Recently, human life around the world has been endangered by the spread of pneumonia-causing viruses, such as H1N1 and the coronavirus that causes COVID-19. To develop effective drugs against the coronavirus, knowledge of protein subcellular localization is indispensable. In 2019, a predictor called “pLoc_bal-mHum” was developed for identifying the subcellular localization of human proteins. Its predicted results are significantly better than those of its counterparts, particularly for proteins that may simultaneously occur in, or move between, two or more subcellular location sites. However, further effort is needed to improve its power, since pLoc_bal-mHum was not trained with “deep learning”, a very powerful technique developed recently. The present study was devoted to incorporating the “deep-learning” technique and developing a new predictor called “pLoc_Deep-mHum”. The global absolute true rate achieved by the new predictor is over 81% and its local accuracy is over 90%. Both are far superior to those of its counterparts. Moreover, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mHum/, which is intended as a useful tool for fighting the coronavirus pandemic.

Highlights

The strong point of this model is that it allows extracting the maximum amount of information from human protein features using CNN convolution layers.
The new predictor developed via the above procedures is called “pLoc_Deep-mHum”, where “pLoc_Deep” stands for “predict subcellular localization by deep learning”, and “mHum” for “multi-label human proteins”.
The newly proposed predictor pLoc_Deep-mHum is remarkably superior to the existing state-of-the-art predictor pLoc_bal-mHum in all five metrics. It can be seen from the table that the absolute true rate achieved by the new predictor is over 81%, which is far beyond the reach of any other existing method.
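
The highlights refer to five metrics for multi-label systems without defining them here. The sketch below assumes they are the set-based metrics commonly used for multi-label predictors (aiming, coverage, accuracy, absolute true, absolute false). The function name, the location names, and the four-label setting are hypothetical and chosen only for illustration; none of this is taken from the paper's code.

```python
from typing import Dict, List, Set


def multilabel_metrics(true_sets: List[Set[str]],
                       pred_sets: List[Set[str]],
                       n_labels: int) -> Dict[str, float]:
    """Average five set-based multi-label metrics over all samples.

    Assumed definitions (not confirmed by the source text):
      aiming         = |true & pred| / |pred|
      coverage       = |true & pred| / |true|
      accuracy       = |true & pred| / |true | pred|
      absolute_true  = 1 if pred equals true exactly, else 0
      absolute_false = (|true | pred| - |true & pred|) / n_labels
    """
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")
    totals = {"aiming": 0.0, "coverage": 0.0, "accuracy": 0.0,
              "absolute_true": 0.0, "absolute_false": 0.0}
    for true, pred in zip(true_sets, pred_sets):
        hit = len(true & pred)      # correctly predicted locations
        union = len(true | pred)    # all locations involved
        totals["aiming"] += hit / len(pred) if pred else 0.0
        totals["coverage"] += hit / len(true) if true else 0.0
        totals["accuracy"] += hit / union if union else 1.0
        totals["absolute_true"] += 1.0 if true == pred else 0.0
        totals["absolute_false"] += (union - hit) / n_labels
    n = len(true_sets)
    return {name: total / n for name, total in totals.items()}


# Three hypothetical proteins, four hypothetical location labels in total.
true_sets = [{"Cytoplasm", "Nucleus"}, {"Mitochondrion"}, {"Nucleus"}]
pred_sets = [{"Cytoplasm", "Nucleus"}, {"Mitochondrion", "Nucleus"}, {"Cytoplasm"}]
scores = multilabel_metrics(true_sets, pred_sets, n_labels=4)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the absolute true rate is the strictest of the five: a protein counts as correct only when every one of its locations is predicted and no extra location is added, which is why it is typically lower than the per-label accuracy.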

Summary

INTRODUCTION

Knowledge of the subcellular localization of proteins is crucially important for two goals: 1) revealing the intricate pathways that regulate biological processes at the cellular level [1, 2], and 2) selecting the right targets [3] for developing new drugs. In 2011, by extracting the GO (Gene Ontology) information of the proteins [6], the same predictor could be used to deal with proteins that have multiple locations, achieving 76% accuracy. It is through these kinds of procedures and their follow-ups that the capacity for dealing with multi-site systems and the accuracy have been further improved. As done in pLoc_bal-mHum [12] as well as in many other recent publications on developing new prediction methods (see, e.g., [9-11, 13-50]), the guidelines of the 5-step rule [51] are followed. They cover the detailed procedures for 1) benchmark dataset, 2) sample formulation, 3) operation engine or algorithm, 4) cross-validation, and 5) web-server. Here our attention is focused on the procedures that differ significantly from those used in developing the predictor pLoc_bal-mHum [12].

Benchmark Dataset

Proteins Sample Formulation

Architecture for the Novel CNN-BiLSTM Network

RESULTS AND DISCUSSION

A Set of Five Metrics for Multi-Label Systems

Comparison with the State-of-the-Art Predictor

Comparison with Several Classic Machine Learning Methods

CONCLUSION

Similar Papers

Support Vector Machine-based Method for Subcellular Localization of Human Proteins Using Amino Acid Compositions, Their Order, and Similarity Search
Aarti Garg ... Gajendra P.S Raghava
Journal of Biological Chemistry | VOL. 280 | 01 Apr 2005

PLoc_Deep-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by Deep Learning
Yutao Shao ... Kuo-Chen Chou
Natural Science | VOL. 12 | 01 Jan 2020

Predicting the Subcellular Localization of Human Proteins Using Machine Learning and Exploratory Data Analysis
George K Acquaah-Mensah ... Chittibabu Guda
Genomics, Proteomics & Bioinformatics | VOL. 4 | 01 Jun 2006

ILoc-Hum: using the accumulation-label scale to predict subcellular locations of human proteins with both single and multiple sites
Kuo-Chen Chou ... Zhi-Cheng Wu
Mol. BioSyst. | VOL. 8 | 01 Jan 2012
