Abstract
This chapter discusses the standard approach to sequence alignment using score matrices, gap penalties, and dynamic programming. It also introduces a robustified notion of sequence alignment, based on a simple imprecise probability model for evolutionary distance. The point accepted mutation (PAM) matrices are widely used as the standard scoring system when looking for evolutionary relationships between protein sequences. They derive from a Markov model for amino acid substitution that describes the evolution of amino acid sequences. Insertions and deletions (indels), which introduce alignment gaps, are not modeled by PAM and are treated separately. From the Markov model for amino acid evolution, a scoring matrix is derived whose entries can be interpreted as log likelihood ratios. Gap openings are less likely than gap extensions, so the gap opening penalty is chosen substantially higher than the gap extension penalty. The gap penalties should also be chosen relative to the range of scores in the score matrix: if the gap penalty is too high, gaps never appear in the optimal alignment, and if it is too low, too many gaps appear.
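As a rough illustration of the log likelihood ratio interpretation, the following Python sketch derives scores from a toy Markov substitution model. It is not the chapter's construction: the three-letter alphabet, the one-step transition matrix `P`, the uniform background distribution `Q`, and all function names are hypothetical, and natural logarithms are used instead of the scaled, rounded values found in published PAM tables. The score for aligning residue a with residue b at evolutionary distance t is taken to be log(P^t(a, b) / q_b).

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, t):
    """Return the t-step transition matrix p^t by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, p)
    return result

def log_odds_scores(p, q, t):
    """Score matrix S[a][b] = log(p^t[a][b] / q[b]).

    p^t[a][b] is the probability that a has become b after t steps;
    q[b] is the probability of seeing b by chance.
    """
    pt = mat_pow(p, t)
    n = len(p)
    return [[math.log(pt[a][b] / q[b]) for b in range(n)] for a in range(n)]

# Hypothetical toy model: three residue types, 1% chance per step of
# mutating into each of the other two. P is symmetric, so the uniform
# distribution Q is stationary.
P = [[0.98, 0.01, 0.01],
     [0.01, 0.98, 0.01],
     [0.01, 0.01, 0.98]]
Q = [1.0 / 3.0] * 3

S1 = log_odds_scores(P, Q, 1)      # short evolutionary distance
S250 = log_odds_scores(P, Q, 250)  # long evolutionary distance
```

In this toy model, matches score positively and mismatches negatively, and both shrink toward zero as t grows, because distantly related sequences become hard to distinguish from unrelated ones. This dependence on t is why the assumed evolutionary distance matters for the scoring system.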