Abstract

This note considers sampling theory for a selectively neutral locus where it is supposed that the data provide nucleotide sequences for the genes sampled. It thus anticipates that technical advances will soon provide data of this form in volume approaching that currently obtained from electrophoresis. The assumption made on the nature of the data will require us to use, in the terminology of Kimura (Theor. Pop. Biol. 2, 174–208 (1971)), the “infinite sites” model of Karlin and McGregor (Proc. Fifth Berkeley Symp. Math. Statist. Prob. 4, 415–438 (1967)) rather than the “infinite alleles” model of Kimura and Crow (Genetics 49, 725–738 (1964)). We emphasize that these two models refer not to two different real-world circumstances, but rather to two different assumptions concerning our capacity to investigate the real world. We compare our results where appropriate with the corresponding sampling theory of Ewens (Theor. Pop. Biol. 3, 87–112 (1972)) for the “infinite alleles” model. Note finally that some of our results depend on an assumption of independence of behavior at individual sites; a parallel paper by Watterson (submitted for publication (1974)) assumes no recombination between sites. Real-world behavior will lie between these two assumptions, closer to the situation assumed by Watterson than to that assumed in this note. Our analysis provides upper bounds for increased efficiency in using complete nucleotide sequences.
