Abstract

A codon-based approach to estimating the number of variable sites in a protein is presented. When first and second positions of codons are assumed to be replacement positions, a capture-recapture model can be used to estimate the number of variable codons from every pair of homologous and aligned sequences. The capture-recapture estimate is compared to a maximum likelihood estimate of the number of variable codons and to previous approaches that estimate the number of variable sites (not codons) in a sequence. Computer simulations are presented that show under which circumstances the capture-recapture estimate can be used to correct biases in distance matrices. Analysis of published sequences of two genes, calmodulin and serum albumin, shows that distance corrections that employ a capture-recapture estimate of the number of variable sites may be considerably different from corrections that assume that the number of variable sites is equal to the total number of positions in the sequence.
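The abstract does not spell out the capture-recapture model, so the following is only an illustrative sketch of the general idea, assuming the classical Lincoln-Petersen estimator rather than the authors' actual formulation. For one pair of aligned, in-frame, ungapped sequences, a difference at the first codon position is treated as one "capture" and a difference at the second position as a "recapture". The function names `codons` and `lincoln_petersen_variable_codons` are hypothetical and do not come from the paper.

```python
def codons(seq):
    """Split an in-frame nucleotide sequence into complete codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def lincoln_petersen_variable_codons(seq_a, seq_b):
    """Estimate the number of variable codons from one aligned pair.

    Assumption (not from the paper): the Lincoln-Petersen estimator
    N = n1 * n2 / m, where
      n1 = codons differing at the first position,
      n2 = codons differing at the second position,
      m  = codons differing at both positions.
    Returns None when m == 0, because the estimate is undefined.
    """
    n1 = n2 = m = 0
    for ca, cb in zip(codons(seq_a), codons(seq_b)):
        d1 = ca[0] != cb[0]  # change at first codon position
        d2 = ca[1] != cb[1]  # change at second codon position
        n1 += d1
        n2 += d2
        m += d1 and d2
    if m == 0:
        return None
    return n1 * n2 / m


# Toy pair of six codons: two differ at both positions,
# one at the first only, one at the second only, two are identical.
seq_a = "AAA" * 6
seq_b = "CCA" + "CAA" + "ACA" + "CCA" + "AAA" + "AAA"
estimate = lincoln_petersen_variable_codons(seq_a, seq_b)
```

In this toy pair, four codons show a replacement-position difference, and the estimator returns a somewhat larger value because it also accounts for variable codons that happen to show no difference in this particular pair. An estimate of this kind could stand in for the total sequence length when correcting pairwise distances, which is the comparison the abstract describes.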
