Abstract

Simple repeats composed of tandemly repeated units 1-6 nucleotides (nt) long were extracted from a selected set of primate genomic DNA sequences. Of the 501 theoretically possible types of repeats, only 67 were present in the analyzed database in at least two different size ranges over 12 nt. These 67 include all simple repeats known to be polymorphic in the primate genome. A list of moderately expanding and nonexpanding oligonucleotide patterns is also included. Furthermore, we compiled statistical data with emphasis on the overall variability of the 67 most abundant types of repeats. We show that the expandability of at least some simple repeats may be affected by the overall base composition and by flanking sequences. In particular, the occurrence of tandemly repeated CAG and GCC triplets in exons correlates positively with the G+C content of those exons. We also noted that tetrameric repeats are more abundant in the vicinity of Alu sequences than in total genomic DNA. This paper can be used as a comprehensive guide for identifying the most abundant and potentially polymorphic simple repeats. It is also of broader significance as a step toward understanding how flanking sequences and overall sequence composition contribute to the variability of simple repeats.
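The figure of 501 possible repeat types can be reproduced by a short enumeration. The sketch below is not taken from the paper. It assumes that two repeat units count as the same type when one is a cyclic rotation of the other or of its reverse complement, and that units which are themselves repetitions of a shorter unit (for example ACAC) are excluded. Under those assumptions the count for unit lengths 1-6 comes to 501. The function names, and the reading of "over 12 nt" as a minimum tract length of 13 nt in the illustrative scanner, are hypothetical choices made for this example.

```python
import re
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(unit):
    """Return the reverse complement of a DNA string."""
    return unit.translate(COMPLEMENT)[::-1]


def is_primitive(unit):
    """True if the unit is not a tandem repetition of a shorter unit."""
    n = len(unit)
    return not any(
        n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n)
    )


def canonical(unit):
    """Smallest string among all rotations of the unit and of its
    reverse complement (assumed equivalence rule, see lead-in)."""
    rc = reverse_complement(unit)
    return min(s[i:] + s[:i] for s in (unit, rc) for i in range(len(unit)))


def count_repeat_types(max_len=6):
    """Count distinct repeat types for each unit length 1..max_len."""
    counts = {}
    for k in range(1, max_len + 1):
        units = ("".join(p) for p in product("ACGT", repeat=k))
        counts[k] = len({canonical(u) for u in units if is_primitive(u)})
    return counts


# A unit of 1-6 nt (shortest first) followed by one or more copies of itself.
TANDEM = re.compile(r"([ACGT]{1,6}?)\1+")


def find_simple_repeats(seq, min_len=13):
    """Illustrative scanner: return (start, canonical unit, tract length)
    for every perfect tandem repeat of at least min_len nt."""
    return [
        (m.start(), canonical(m.group(1)), len(m.group(0)))
        for m in TANDEM.finditer(seq)
        if len(m.group(0)) >= min_len
    ]


if __name__ == "__main__":
    counts = count_repeat_types()
    print(counts, sum(counts.values()))
    print(find_simple_repeats("TTGA" + "CAG" * 5 + "TCGA"))
```

Because the canonical form merges both strands and all rotations, a CAG tract and a CTG tract are reported under the same label (AGC here). The scanner only detects perfect repeats, so interrupted tracts would need a more tolerant method.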
