Abstract

Background: Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number. Frequently used as molecular markers, SSRs are increasingly identified via in silico approaches. Two common classes of genomic resources that can be mined are bacterial artificial chromosome (BAC) libraries and expressed sequence tag (EST) libraries.

Results: 288 SSR loci were screened in the rapidly radiating Hawaiian swordtail cricket genus Laupala. SSRs were more densely distributed and contained longer repeat structures in BAC library-derived sequence than in EST library-derived sequence, although neither repeat density nor length was exceptionally elevated despite the relatively large genome size of Laupala. A non-random distribution favoring AT-rich SSRs was observed. Allelic diversity of SSRs was positively correlated with repeat length and was generally higher in AT-rich repeat motifs.

Conclusion: The first large-scale survey of Orthopteran SSR allelic diversity is presented. Selection contributes more strongly to the size and density distributions of SSR loci derived from EST library sequence than from BAC library sequence, although all SSRs likely are subject to similar physical and structural constraints, such as slippage of DNA replication machinery, that may generate increased allelic diversity in AT-rich sequence motifs. Although in silico approaches work well for SSR locus identification in both EST and BAC libraries, BAC library sequence and AT-rich repeat motifs are generally superior SSR development resources for most applications.

Highlights

  • Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number

  • 186 simple sequence repeat (SSR) loci were identified in bacterial artificial chromosome (BAC) library sequence (1.71 Mb) and 550 in expressed sequence tag (EST) library sequence (10.17 Mb); of these, we were able to design primers in flanking sequence for 135 and 435 loci, respectively

  • Distribution of SSRs in genomic library sequence: To test whether SSRs were more likely to involve a particular sequence motif, we developed a posterior probability distribution of each di- and trinucleotide repeat
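
The summary does not say how the posterior distribution over repeat motifs was built. As a rough illustration only, the sketch below assumes a symmetric Dirichlet prior over motif classes updated with observed motif counts (a Dirichlet-multinomial model) and returns the posterior mean probability of each class. The function name `motif_posterior`, the `alpha` parameter, and the example counts are hypothetical and are not taken from the paper.

```python
from collections import Counter

def motif_posterior(observed_motifs, motif_classes, alpha=1.0):
    """Posterior mean probability of each motif class.

    Assumption (not from the source): a symmetric Dirichlet(alpha) prior
    over motif classes, updated with the observed motif counts.
    """
    counts = Counter(observed_motifs)
    total = sum(counts[m] for m in motif_classes) + alpha * len(motif_classes)
    return {m: (counts[m] + alpha) / total for m in motif_classes}

# The four dinucleotide classes that remain after merging shifted and
# reverse-complement motifs (for example AC, CA, GT and TG form one class).
DINUCLEOTIDE_CLASSES = ["AC", "AG", "AT", "CG"]

# Made-up observations, for illustration only.
posterior = motif_posterior(["AT", "AT", "AC"], DINUCLEOTIDE_CLASSES)
for motif, prob in posterior.items():
    print(f"{motif}: {prob:.3f}")
```

Under a uniform prior, a motif class whose posterior probability sits well above 1/K (K being the number of classes) would point to a non-random motif distribution, such as the AT-rich bias reported in the abstract.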


Summary

Introduction

Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number. Many features that shape genome evolution generally, such as nucleotide composition, may play a large role in the variability of microsatellite density across the genome. SSRs located within coding regions tend to have an excess of trinucleotide repeats relative to other repeat classes and a specific excess of (CAG)n SSR loci [18]. This is generally attributed (1) to the fact that length-variant trinucleotide SSRs maintain the appropriate reading frame within the coding region and (2) to the observation that glutamine (CAG) repeats have fewer detrimental effects within a protein than many other repeated amino acids [19]. The number of allelic length variants associated with an SSR locus typically increases with increasing average repeat number at that locus [22,23] (but see [24]), and allelic diversity is thought to be primarily a consequence of physical parameters and structural properties of the SSR sequence motif [15].
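
To make the in silico identification and the reading-frame argument concrete, here is a minimal Python sketch. It scans a sequence for perfect di- and trinucleotide repeats with a regular expression, and it checks whether a change in repeat number keeps a coding sequence in frame. The function names and the minimum repeat thresholds (6 for dinucleotides, 4 for trinucleotides) are illustrative assumptions, not the criteria used in the study.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {2: 6, 3: 4}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for size, min_n in min_repeats.items():
        # A motif of `size` bases followed by at least min_n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)

def preserves_reading_frame(motif_length, repeat_change):
    """True if gaining or losing `repeat_change` copies keeps the frame."""
    return (motif_length * repeat_change) % 3 == 0

example = "TTT" + "CAG" * 5 + "GGC" + "AT" * 7 + "GGG"
print(find_ssrs(example))
print(preserves_reading_frame(3, 1), preserves_reading_frame(2, 1))
```

Because the regular expression reports the leftmost match, the motif it returns can be a rotation of the conventional name (for example GCA rather than CAG) when the flanking bases happen to extend the repeat by one position. The frame check shows why trinucleotide length variants are tolerated in coding sequence: any change in repeat number shifts the sequence by a multiple of three bases, whereas a single-copy change in a dinucleotide repeat does not.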

