Abstract

Existing methods have successfully predicted phage (bacteriophage) virion proteins (PVPs) using various types of protein features and complex classifiers, such as support vector machine and naïve Bayes, but these classifiers offer little interpretability. Yet the characterization and analysis of PVPs may be of great significance for understanding the molecular mechanisms of bacteriophage genetics and for developing antibacterial drugs. Hence, we propose a novel method (PVPred-SCM) based on the scoring card method (SCM) in conjunction with dipeptide composition to identify and characterize PVPs. In PVPred-SCM, the propensity scores of 400 dipeptides were calculated using a statistical discrimination approach. A rigorous independent validation test showed that PVPred-SCM, using only dipeptide composition, yielded an accuracy of 77.56%, indicating that it performed well relative to the state-of-the-art method, which uses a number of protein features. Furthermore, the propensity scores of dipeptides were used to provide insights into the biochemical and biophysical properties of PVPs. In comparison with existing methods, PVPred-SCM was superior in terms of simplicity, interpretability, and ease of implementation. Finally, to facilitate high-throughput prediction of PVPs, we provide a user-friendly web server that estimates the likelihood that query sequences are PVPs. It is anticipated that PVPred-SCM will become a useful tool, or at least a complement to existing methods, for predicting and analyzing PVPs.
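The way a scoring-card classifier might combine dipeptide composition with propensity scores can be sketched as follows. This is a minimal illustration, not the PVPred-SCM implementation: the function names, the propensity values, and the threshold are hypothetical, and the weighted-sum scoring rule is assumed from the general idea of a scoring card rather than stated in this abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered pairs of standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return the fraction of each of the 400 dipeptides in a sequence.

    Overlapping windows of length 2 are counted; windows containing a
    non-standard residue are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        dp = seq[i:i + 2]
        if dp in counts:
            counts[dp] += 1
            total += 1
    if total == 0:
        return {dp: 0.0 for dp in DIPEPTIDES}
    return {dp: c / total for dp, c in counts.items()}


def scm_score(seq, propensity):
    """Assumed scoring rule: composition-weighted sum of propensity scores."""
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * propensity.get(dp, 0.0) for dp in DIPEPTIDES)


def predict_pvp(seq, propensity, threshold):
    """Label a sequence as PVP when its score exceeds a (hypothetical) threshold."""
    return scm_score(seq, propensity) > threshold
```

Because the score is a plain weighted sum, each dipeptide's contribution to a prediction can be read directly from its propensity score, which is the interpretability argument made above.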

Highlights

  • Bacteriophages, viruses that infect and multiply only in bacteria, are found in several environments, such as soil, freshwater, and marine habitats, and are surrounded by a protein coat [1]

  • Five steps are involved in the development of this method: (i) collecting both benchmark and independent datasets, (ii) calculating init-DPS using a statistical approach, (iii) optimizing init-DPS to obtain opti-DPS using the genetic algorithm (GA), (iv) predicting phage structural (virion) proteins (PVPs), and (v) characterizing PVPs using the propensity scores of dipeptides

  • To enable easy and rapid classification of query protein sequences, PVPred-SCM was deployed as a free prediction web server for discriminating PVPs from non-PVPs
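Step (ii) in the highlights, calculating initial dipeptide propensity scores (init-DPS) with a statistical approach, could look roughly like the sketch below. The exact procedure is not given in this excerpt, so the recipe here is an assumption: dipeptide frequencies are computed per class, the non-PVP frequency is subtracted from the PVP frequency, and the differences are rescaled to the range 0 to 1000. The function names and the rescaling range are hypothetical, and the GA optimization of step (iii) is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def class_frequencies(sequences):
    """Pooled dipeptide frequencies over all sequences of one class."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for seq in sequences:
        s = seq.upper()
        for i in range(len(s) - 1):
            dp = s[i:i + 2]
            if dp in counts:  # skip windows with non-standard residues
                counts[dp] += 1
    total = sum(counts.values())
    return {dp: (c / total if total else 0.0) for dp, c in counts.items()}


def initial_propensity_scores(pvp_seqs, non_pvp_seqs):
    """Assumed init-DPS recipe: class frequency difference, min-max scaled to 0..1000."""
    f_pos = class_frequencies(pvp_seqs)
    f_neg = class_frequencies(non_pvp_seqs)
    diff = {dp: f_pos[dp] - f_neg[dp] for dp in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    if hi == lo:
        # No dipeptide discriminates between the classes; return neutral scores.
        return {dp: 500.0 for dp in DIPEPTIDES}
    return {dp: 1000.0 * (d - lo) / (hi - lo) for dp, d in diff.items()}
```

Under this assumed scheme, dipeptides enriched in PVPs receive scores near 1000 and those enriched in non-PVPs receive scores near 0, which is what makes the scores usable for characterization in step (v).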


Summary

Introduction

Bacteriophages are surrounded by a protein coat (capsid) [1]. These bacterial viruses are very specific with regard to their host, which is a single bacterial strain or species. Bacteriophages irreversibly attach to the surface of a susceptible host, insert their genetic information, and persist using one of two strategies: the lytic or the lysogenic life cycle [2]. Their characteristics include a lack of toxicity. It is necessary to develop a computational model for discriminating PVPs from non-PVPs and for characterizing the biochemical and biophysical properties of PVPs.

