Abstract

This work aims to classify and differentiate dehalogenase enzymes from non-dehalogenases (other hydrolases) using amino acid propensity computed for the protein core, the surface, and both regions combined. The data sets were built individually from 3D protein structures available in the Protein Data Bank (PDB). Core amino acids were predicted with the IPFP tool, and their structural propensities were calculated with Propensity Calculator, an in-house software tool that is available online. All data sets were then grouped into two categories, dehalogenase and non-dehalogenase, using the Naive Bayes, J48, Random Forest, K-means clustering, and SMO algorithms. In a comparison of these methods, the tree-based method (Random Forest) performed well, reaching a maximum classification accuracy of 98.88% on the core propensity data set. We therefore propose that core amino acid propensity could serve as a novel descriptor for the classification of enzymes.
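As a rough illustration of the kind of descriptor discussed here, the sketch below computes a per-residue propensity for a region such as the core. It assumes one common definition, namely the frequency of a residue in the region divided by its frequency in the whole protein; the abstract does not state the formula used by Propensity Calculator, so this definition, the function name `propensity`, and the toy sequences are all hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def propensity(region_seq, full_seq):
    """Return {residue: propensity} for a region of a protein.

    Assumed definition (not taken from the paper):
    (frequency of the residue in the region) /
    (frequency of the residue in the full sequence).
    Residues absent from the full sequence get 0.0.
    """
    region_counts = Counter(region_seq)
    full_counts = Counter(full_seq)
    result = {}
    for aa in AMINO_ACIDS:
        if not region_seq or full_counts[aa] == 0:
            result[aa] = 0.0
            continue
        region_freq = region_counts[aa] / len(region_seq)
        full_freq = full_counts[aa] / len(full_seq)
        result[aa] = region_freq / full_freq
    return result


# Hypothetical toy data: a 10-residue protein and its 4 core residues.
full = "AAAALLGGKK"
core = "AALL"
core_propensity = propensity(core, full)

# The 20 values form one feature vector per protein for a classifier.
feature_vector = [core_propensity[aa] for aa in AMINO_ACIDS]
```

Under this assumed definition, a value above 1 means the residue is over-represented in the region relative to the whole protein, and a value below 1 means it is under-represented.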
