Abstract

The authors apply hidden Markov models to the problem of statistical modeling and multiple sequence alignment of protein families. A variant of the expectation maximization algorithm known as the Viterbi algorithm is used to obtain the statistical model from the unaligned sequences. In a detailed series of experiments, they have taken 400 unaligned globin sequences, and produced a statistical model entirely automatically from the primary sequences. The authors used no prior knowledge of globin structure. Using this model, a multiple alignment of the 400 sequences and 225 other globin sequences was obtained that agrees almost perfectly with a structural alignment by D. Bashford et al. (1987). This model can also discriminate all these 625 globins from nonglobin protein sequences with greater than 99% accuracy, and can thus be used for database searches.
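
As a rough illustration of the dynamic-programming step that Viterbi-style methods build on, the sketch below finds the single most likely state path through a toy hidden Markov model. It is not the authors' training procedure or model: the two states (`s1`, `s2`), the two-letter alphabet, and all probabilities are hypothetical values chosen only to keep the example small. In a Viterbi-based variant of expectation maximization, a decoding pass like this one would presumably alternate with re-estimating the model parameters from the decoded paths.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (log-probability, state path) of the most likely path for obs."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model (not taken from the paper).
states = ["s1", "s2"]
log_start = {"s1": math.log(0.6), "s2": math.log(0.4)}
log_trans = to_log({"s1": {"s1": 0.8, "s2": 0.2}, "s2": {"s1": 0.2, "s2": 0.8}})
log_emit = to_log({"s1": {"A": 0.9, "G": 0.1}, "s2": {"A": 0.1, "G": 0.9}})

score, path = viterbi("AAGG", states, log_start, log_trans, log_emit)
print(path, math.exp(score))
```

Working in log space avoids numerical underflow on long sequences, which matters for protein-length inputs. A multiple alignment can then be read off by lining up the residues that different sequences assign to the same model state.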
