Abstract

In the post-genome period, the protein domain structures are published rapidly, but they have not been studied comprehensively. To figure out the cell function, the protein---DNA interactions decrypt the protein domain structures in recent research. Several machine-learning based methods are applied to the issue; however, they are not efficient to translate the tertiary structure characteristics of proteins into appropriate features for predicting the DNA-binding proteins. In this work, a novel machine-learning approach based on hidden Markov models identifies the characteristics of DNA-binding proteins with their amino acid sequences and tertiary structures. After we distill the features from DNA-binding proteins, a support vector machine based classifier predicts general DNA-binding proteins with the accuracy of 88.45 % through fivefolds cross-validation. Furthermore, we construct a response element specific classifier for predicting response element specific DNA-binding proteins, and the performance achieves the precision of 96.57 % with recall rate as 88.83 % in average. To verify the prediction of DNA-binding proteins, we used the DNA-binding proteins from MCF-7 that are likely to bind with estrogen response elements (ERE), and the results show that our methods can apply to practice.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.