Abstract

This paper presents a hidden Markov model method (referred to as HMM_AA_SA) for identifying Helix-Turn-Helix (HTH) DNA-binding motifs. The method takes an amino acid sequence and its predicted solvent accessibility as input. Solvent accessibility is predicted from the amino acid sequence and discretized into three categories: buried (B), medium (M) and exposed (E). At each state, HMM_AA_SA emits not only one amino acid letter but also one solvent accessibility letter. The method is evaluated on 12 families of HTH motifs from Pfam. Hidden Markov models are built and tested for each family individually using three-fold cross-validation. The results show that adding predicted solvent accessibility to the model increases sensitivity by 5.7%, reaching 94.9%. We also explore several reduced amino acid alphabets in order to reduce the complexity of protein sequences and the number of parameters in the model. The results show that reduced alphabets not only reduce the number of parameters but also improve performance. One interesting finding is that an HMM_AA_SA model built from one HTH family can identify HTH motifs from other families, suggesting that the HMM_AA_SA method captures features shared by different families of HTH motifs. This ability improves when the hidden Markov models are built from the sequence fragments directly involved in the HTH motifs.
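The idea of a state emitting two letters at once can be illustrated with a small sketch. The code below is not the authors' implementation: it uses a plain fully connected HMM rather than a Pfam-style profile HMM, it assumes the amino acid and solvent accessibility letters are conditionally independent given the state, and the B/M/E cutoffs, the two-letter amino acid alphabet, and all probabilities in the toy model are hypothetical values chosen only for illustration.

```python
import math


def discretize_sa(rsa, low=0.1, high=0.4):
    """Map a relative solvent accessibility value (0..1) to B, M or E.

    The cutoffs `low` and `high` are hypothetical; the abstract does not
    state the thresholds actually used.
    """
    if rsa < low:
        return "B"
    if rsa < high:
        return "M"
    return "E"


def forward_log_likelihood(aa_seq, sa_seq, start, trans, emit_aa, emit_sa):
    """Scaled forward algorithm for an HMM whose states emit letter pairs.

    Assumption: given the state s, the amino acid letter a and the solvent
    accessibility letter x are independent, so the pair is emitted with
    probability emit_aa[s][a] * emit_sa[s][x].
    """
    n_states = len(start)
    # Initialisation with the first emitted (amino acid, SA) pair.
    alpha = [
        start[s] * emit_aa[s][aa_seq[0]] * emit_sa[s][sa_seq[0]]
        for s in range(n_states)
    ]
    log_like = 0.0
    for a, x in zip(aa_seq[1:], sa_seq[1:]):
        # Rescale to avoid underflow and accumulate the log of the scale.
        total = sum(alpha)
        log_like += math.log(total)
        alpha = [v / total for v in alpha]
        # Propagate through the transitions, then emit the next pair.
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states))
            * emit_aa[s][a] * emit_sa[s][x]
            for s in range(n_states)
        ]
    return log_like + math.log(sum(alpha))


# Toy two-state model over a two-letter amino acid alphabet (hypothetical).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit_aa = [{"A": 0.9, "G": 0.1}, {"A": 0.2, "G": 0.8}]
emit_sa = [{"B": 0.7, "M": 0.2, "E": 0.1}, {"B": 0.1, "M": 0.3, "E": 0.6}]

# Discretize predicted accessibilities, then score the paired sequence.
sa = "".join(discretize_sa(v) for v in [0.05, 0.55])
print(forward_log_likelihood("AG", sa, start, trans, emit_aa, emit_sa))
```

Under the independence assumption, adding the solvent accessibility track adds only three emission parameters per state, which is one plausible reason to factor the pair emission this way. A model with a full joint table over (amino acid, accessibility) pairs would be a different design with many more parameters.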
