Abstract

Hydroxylation, a post-translational modification, has drawn less attention than other modifications such as phosphorylation and acetylation. However, beyond regulating protein stability, hydroxylation has been found to affect protein activity, so a better understanding of its biological role is needed. Identifying hydroxylated substrates and their modification sites is important for studying the underlying molecular mechanism. Because experimental approaches are time-consuming and labor-intensive, fast and convenient computational methods for identifying hydroxylation sites are highly desirable. Here, we present HydLoc (Hydroxylation sites Location), a random forest-based predictor of hydroxylation sites in human proteins that uses sequence information and physicochemical properties. In leave-one-out cross-validation on the training dataset, HydLoc achieved accuracies of 84.25% and 80.61% for proline (P) and lysine (K) residues, respectively. On the independent test dataset, it achieved accuracies of 90.74% and 81.25% for P and K hydroxylation site prediction, respectively, with sensitivities of 96.29% and 75.00%, which outperform existing methods. A user-friendly web server for HydLoc is available at https://www.gdpu-bioinfolab.com/hydloc/
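For readers less familiar with the reported metrics, the sketch below shows how accuracy and sensitivity are conventionally computed from binary site predictions (1 = hydroxylated, 0 = not hydroxylated). It is an illustration only: the function names and the toy labels are assumptions made for this example and are not taken from the HydLoc implementation or its datasets.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks a hydroxylated site."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all sites (positive and negative) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true hydroxylation sites that were recovered (recall)."""
    return tp / (tp + fn)


# Toy labels, invented for illustration; not HydLoc data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.2%}")
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
```

Sensitivity counts only the true hydroxylation sites, so a predictor can have high sensitivity with lower overall accuracy if it over-predicts positives, which is why the two values are reported separately.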
