Abstract

The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering human lives all around the world. To understand the biological processes at the cellular level and to provide useful clues for developing antiviral drugs, information on the subcellular localization of virus proteins is vitally important. In view of this, a CNN-based virus protein subcellular localization predictor called “pLoc_Deep-mVirus” was developed. The predictor is particularly useful for multi-site systems, in which some proteins may simultaneously occur in two or more different organelles and which are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 97% and its local accuracy is over 98%. Both significantly exceed those of other existing state-of-the-art predictors. It has not escaped our notice that the deep-learning treatment can be used to deal with many other biological systems as well. For the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mVirus/.

Highlights

  • Knowledge of the subcellular localization of proteins is crucially important for fulfilling the following two important goals: 1) revealing the intricate pathways that regulate biological processes at the cellular level [1, 2]; and 2) selecting the right targets [3] for developing new drugs.

  • Most existing protein subcellular location prediction methods were developed based on the single-label system, in which it was assumed that each constituent protein had one, and only one, subcellular location.

  • To make them more intuitive and easier to understand for most experimental scientists, here we use Chou’s five intuitive metrics [42], also called the “global metrics”, which have recently been widely used for studying various multi-label systems.
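The excerpts above name five metrics for multi-label systems but do not define them. The following minimal sketch assumes the definitions commonly used in the multi-label localization literature (aiming, coverage, accuracy, absolute true, absolute false); these formulas, the function name `multilabel_metrics`, and the toy location names are illustrative assumptions, not taken from this text, and the paper's own definitions in [42] may differ in detail.

```python
from typing import Dict, List, Set


def multilabel_metrics(
    true_sets: List[Set[str]],
    pred_sets: List[Set[str]],
    num_labels: int,
) -> Dict[str, float]:
    """Average five set-based scores over all proteins.

    Assumed definitions (not stated in the source text), for a protein
    with true location set T and predicted location set P:
      aiming         = |T & P| / |P|
      coverage       = |T & P| / |T|
      accuracy       = |T & P| / |T | P|
      absolute_true  = 1 if T == P else 0
      absolute_false = (|T | P| - |T & P|) / num_labels
    """
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty lists of equal length")

    totals = {
        "aiming": 0.0,
        "coverage": 0.0,
        "accuracy": 0.0,
        "absolute_true": 0.0,
        "absolute_false": 0.0,
    }
    for t, p in zip(true_sets, pred_sets):
        inter = len(t & p)
        union = len(t | p)
        # Guard against empty sets so the sketch never divides by zero.
        totals["aiming"] += inter / len(p) if p else 0.0
        totals["coverage"] += inter / len(t) if t else 0.0
        totals["accuracy"] += inter / union if union else 1.0
        totals["absolute_true"] += 1.0 if t == p else 0.0
        totals["absolute_false"] += (union - inter) / num_labels

    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


# Toy example: two hypothetical proteins, six hypothetical location classes.
true_locations = [{"host nucleus", "host cytoplasm"}, {"viral capsid"}]
predicted = [{"host nucleus"}, {"viral capsid"}]
scores = multilabel_metrics(true_locations, predicted, num_labels=6)
print(scores)
```

Under these assumed definitions, a multi-site protein counts toward the absolute true rate only when every one of its locations is predicted with no extras, which is why that score is the strictest of the five.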

Summary

INTRODUCTION

Knowledge of the subcellular localization of proteins is crucially important for two goals: 1) revealing the intricate pathways that regulate biological processes at the cellular level [1, 2]; and 2) selecting the right targets [3] for developing new drugs. In 2019, a very powerful predictor called “pLoc_bal-mVirus” [4] was developed for predicting the subcellular localization of virus proteins based on their sequence information alone. As in pLoc_bal-mVirus [4] and many other recent publications on new prediction methods (see, e.g., [15, 16]), the guidelines of the 5-step rule [17] are followed. They cover the detailed procedures for 1) benchmark dataset, 2) sample formulation, 3) operation engine or algorithm, 4) cross-validation, and 5) web-server. Here our attention is focused on the procedures that differ significantly from those used in developing the predictor pLoc_bal-mVirus [4].

Benchmark Dataset
Proteins Sample Formulation
Installing Deep-Learning for Three Deeper Levels
RESULTS AND DISCUSSION
A Set of Five Metrics for Multi-Label Systems
Comparison with the State-of-the-Art Predictor
Host cell membrane
Web Server and User Guide
CONCLUSION
