Abstract

Profile hidden Markov models (profile HMMs), which are based on classical hidden Markov models, have been widely applied to the alignment and classification of protein sequence families. The forward and backward variables in profile HMMs are formulated under the statistical independence assumption of probability theory. We propose a fuzzy profile hidden Markov model to overcome the limitations of this assumption. The strong correlations and sequence preferences found in protein structures make models based on fuzzy architectures suitable candidates for building profiles of a given family, since fuzzy sets can handle uncertainty better than classical methods. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures through Choquet integrals, and this formulation is extended to a fuzzy Baum-Welch parameter estimation algorithm for profiles. The model was built and tested on the widely studied globin and kinase family sequences, and its performance was compared with that of the classical HMM. A comparative analysis based on log-likelihood (LL) scores of sequences and receiver operating characteristic (ROC) curves demonstrates the superiority of fuzzy profile HMMs over the classical profile model.
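As an illustration of the building block named above, the following is a minimal sketch of a discrete Choquet integral taken with respect to a Sugeno λ-measure. It is not the authors' implementation: the abstract does not give the exact formulation, and the function names (`sugeno_measure`, `choquet_integral`), the argument layout, and the choice to pass λ in directly rather than solve for it are all assumptions made for illustration. The sketch assumes non-negative values, as would be the case for probability-like quantities.

```python
from typing import Sequence


def sugeno_measure(densities: Sequence[float], lam: float) -> float:
    """Measure of a set from the densities of its elements (hypothetical helper).

    Builds the set one element at a time with the Sugeno lambda-rule
    g(A ∪ {x}) = g(A) + g_x + lam * g(A) * g_x.
    """
    g = 0.0
    for d in densities:
        g = g + d + lam * g * d
    return g


def choquet_integral(values: Sequence[float],
                     densities: Sequence[float],
                     lam: float) -> float:
    """Discrete Choquet integral of non-negative `values` (illustrative sketch).

    Values are visited in ascending order; each increment is weighted by the
    measure of the set of elements whose value is at least the current one.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    prev = 0.0
    for rank, i in enumerate(order):
        # Elements from the current rank upward form the "tail" set.
        tail = [densities[j] for j in order[rank:]]
        total += (values[i] - prev) * sugeno_measure(tail, lam)
        prev = values[i]
    return total
```

When λ = 0 the measure is additive, and if the densities sum to 1 the integral reduces to an ordinary weighted sum. This is the sense in which a Choquet-based aggregation generalizes the sums used in classical forward and backward recursions; for λ ≠ 0 the measure is no longer additive, so interactions between elements affect the result.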
