Abstract

Background: Simple sequence repeats (SSRs) are repeat units of 1 to 6 bp that are abundant in eukaryotic genomes, occur in both coding and non-coding regions, and may affect gene expression. In this study, expressed sequence tags (ESTs) of eight Prunus species were analyzed for in silico mining of EST-SSRs, protein annotation, open reading frames (ORFs), and the identification of codon repetitions.

Results: A total of 316 SSRs were identified using MISA software. Dinucleotide SSR motifs (26.31 %) were the most abundant type of repeat, followed by tri- (14.58 %), tetra- (0.53 %), and penta- (0.27 %) nucleotide motifs. Primer design was attempted for all 316 identified SSRs but succeeded for only 175 SSR sequences. The positions of SSRs with respect to ORFs were detected, and sequences containing SSRs were annotated to assign a function to each sequence. SSRs were also characterized (in terms of position in the reference genome and associated gene) using the two available Prunus reference genomes (mei and peach). Finally, 38 SSR markers were validated across peach, almond, plum, and apricot genotypes. This validation showed a higher level of transferability for the EST-SSRs developed in P. mume (mei) than for those from the other species analyzed.

Conclusions: The findings will aid the analysis of functionally important molecular markers and facilitate the analysis of genetic diversity.

Electronic supplementary material: The online version of this article (doi:10.1186/s13104-016-2143-y) contains supplementary material, which is available to authorized users.

Highlights

  • Simple sequence repeats (SSRs) are repeat units of 1 to 6 bp that are abundant in eukaryotic genomes, occur in both coding and non-coding regions, and may affect gene expression

  • Expressed sequence tags (ESTs) retrieved from NCBI were mined for simple sequence repeats (SSRs), which were characterized, and a subset was used for marker design

  • Development and application of molecular markers is of immense importance in the examination of the genetic composition, inter-species variability, and evolutionary relationships of Prunus species

Introduction

Simple sequence repeats (SSRs) are repeat units of 1 to 6 bp that are abundant in eukaryotic genomes, occur in both coding and non-coding regions, and may affect gene expression. Recent molecular phylogenetic studies have concluded that the genus Prunus is divided into three important subgenera (Amygdalus, Cerasus and Prunus), which include species of high economic value that produce edible drupes or seeds. Traditional SSR development involves genomic library construction, hybridization with the repeated nucleotide units, and sequencing of the clones. These traditional methods have been applied in Prunus species to develop SSR-ESTs in peach [4, 5], apricot [6, 7], almond [8, 9] and mei [10, 11]. EST databases store expressed sequences that are redundant, so they contain repetitive units [12]. Such computational approaches have recently been applied in Prunus species, albeit only in the reference peach genome [13, 14].
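To make the idea of in silico SSR mining concrete, the following is a minimal sketch of how perfect repeats of 1 to 6 bp units can be located in a sequence with regular expressions. It is an illustration only and does not reproduce MISA or the settings used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example.

```python
import re

# Hypothetical minimum number of repeats per motif length (1 to 6 bp).
# These thresholds are assumptions for illustration, not the study's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples for perfect SSRs.

    Coordinates are 1-based and inclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA" or "ACAC"),
            # so each repeat is reported once, under its shortest unit.
            is_redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if is_redundant:
                continue
            repeat_count = len(match.group(1)) // unit_len
            hits.append((match.start() + 1, match.end(), motif, repeat_count))
    return sorted(hits)


if __name__ == "__main__":
    est = "TTG" + "AC" * 6 + "GGT"  # toy sequence, not real EST data
    for start, end, motif, count in find_ssrs(est):
        print(f"({motif}){count} at {start}-{end}")
```

A sketch like this only finds perfect repeats; compound or interrupted SSRs, which dedicated tools also report, would need additional logic.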
