Abstract

Simple sequence repeats (SSRs) can be derived from the complete genome sequence. These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop SSRs for cotton gene mapping, we selected the complete genome sequence of Gossypium raimondii, which consisted of 4447 non-redundant scaffolds. Of the 775.2 Mb of sequence examined, a total of 136,345 microsatellites were identified, a density of one SSR per 5.69 kb in the G. raimondii genome, leading to the development of 112,177 primer pairs. The distribution of SSRs in the genome was non-random. Among motifs ranging from 1 to 6 bp, penta-nucleotide repeats were the most abundant (30.5%), followed by tetra-nucleotide repeats (18.2%) and di-nucleotide repeats (16.9%). Among the 457 motif types identified, the most frequent repeat motifs were poly-AT/TA, which accounted for 79.8% of all di-nucleotide SSRs, followed by AAAT/TTTA, which accounted for 51.5% of all tetra-nucleotide SSRs. Further, 18,834 microsatellites were detected in protein-coding genes, and the frequency of genes containing SSRs was 46.0% among the 40,976 genes of G. raimondii. The genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and MAS breeding in cotton.

Highlights

  • Simple sequence repeats (SSRs) are tandemly repeated DNA motifs (1-6 bp long) which are present in both protein coding and non-coding regions of DNA sequences, and show a high level of length polymorphism due to mutations of one or more repeats

  • These genome-based SSRs developed in the present study will lay the groundwork for developing large numbers of SSR markers for genetic mapping, gene discovery, genetic diversity analysis, and marker-assisted selection (MAS) breeding in cotton

  • Many studies to date have reported the application of molecular markers in cotton for genetic mapping, gene discovery, genetic diversity analysis, and MAS

Introduction

Simple sequence repeats (SSRs) are tandemly repeated DNA motifs (1-6 bp long) that occur in both protein-coding and non-coding regions of DNA sequences and show a high level of length polymorphism due to mutations of one or more repeats. SSRs are easy to use and analyze by virtue of their multiallelic nature, reproducibility, high abundance and extensive genome coverage [1, 2]. The traditional methods of developing SSR markers are usually time-consuming and labor-intensive: they involve genomic library construction, hybridization with the repeated nucleotide units, and sequencing of the clones. Developing SSR markers computationally from the genome sequence provides a better platform than this conventional approach. Several bioinformatic tools for the identification of microsatellites in genomic sequences have been developed. The most commonly used tools for SSR search are: SSRIT [3],

ISSN 0973-2063 (online) 0973-8894 (print)
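To make the computational approach concrete, the following is a minimal sketch of how perfect SSRs with 1-6 bp motifs might be located in a sequence using regular expressions. It is not the algorithm of SSRIT or of any other tool named here, and it is not the pipeline used in this study. The names `MIN_REPEATS`, `is_primitive`, `find_ssrs` and `kb_per_ssr` are hypothetical, and the minimum repeat counts are assumed values chosen only for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (1-6 bp); illustrative only,
# not the thresholds used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return all(motif != motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples, 0-based, end-exclusive."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-bp unit followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            if is_primitive(m.group(1)):
                hits.append((m.start(), m.end(), m.group(1), len(m.group(0)) // k))
                pos = m.end()
            else:
                # The unit is a run of a shorter motif; such runs are counted,
                # if long enough, when scanning for the shorter motif length.
                pos = m.start() + 1
    return sorted(hits)


def kb_per_ssr(total_bp, ssr_count):
    """Average sequence length per SSR in kb (the density measure in the abstract)."""
    return total_bp / 1000 / ssr_count


# Toy sequence: an (AT)8 repeat and an (AAAT)5 repeat separated by spacers.
demo = "GCGC" + "AT" * 8 + "CCG" + "AAAT" * 5 + "GGC"
for hit in find_ssrs(demo):
    print(hit)

# Density from the figures in the abstract: 775.2 Mb and 136,345 SSRs.
print(round(kb_per_ssr(775.2e6, 136345), 2))
```

On the toy sequence this prints `(4, 20, 'AT', 8)` and `(23, 43, 'AAAT', 5)`, followed by `5.69`, which matches the kb-per-SSR density reported in the abstract. The sketch handles only perfect repeats on one strand. Compound or imperfect SSRs and the grouping of complementary motifs would need additional logic.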
