A set of 899 L. gmelinii expressed sequence tags (ESTs), available from the National Center for Biotechnology Information (NCBI), was used to assess the feasibility of developing simple sequence repeat (SSR) markers for larch species. In total, 634 non-redundant unigenes, comprising 145 contigs and 489 singletons, were identified; according to BLASTX results, gene ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) maps, they were mainly involved in biosynthetic processes, metabolic processes and response to stress. Approximately 11.7% (74) of the unigenes contained a total of 90 candidate SSRs, which were mainly trinucleotide (29, 32.2%) and dinucleotide (26, 28.9%) repeats. SSRs were relatively frequent in the open reading frame (ORF, about 54.4%) and the 5′-untranslated region (5′-UTR, 31.2%), and less frequent in the 3′-untranslated region (3′-UTR, about 14.4%). Of the 45 novel EST-SSR markers, nine were polymorphic in two L. gmelinii populations. The number of alleles per locus (Na) ranged from two to four, and the observed (Ho) and expected (He) heterozygosity values were 0.200–0.733 and 0.408–0.604, respectively. The inbreeding coefficients (FIS) were greater than zero for all loci except Lg41. Most of these nine EST-SSR markers were transferable to the related species L. kaempferi, L. principis-rupprechtii and L. olgensis. These novel EST-SSRs will be useful for further research on comparative genomics, conservation of genetic resources and molecular breeding in larch trees.
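The per-locus statistics reported above (Na, Ho, He and FIS) are related in a simple way, which the following minimal Python sketch illustrates. It is not the authors' pipeline: the abstract does not say which software or estimator was used, so the sketch assumes the plain (uncorrected) expected heterozygosity He = 1 - Σ p_i² and FIS = 1 - Ho/He. The function name `locus_stats` and the genotype data are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarise one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Assumption: He is the uncorrected estimate 1 - sum(p_i^2); published
    analyses often apply a small-sample correction instead.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Count every allele copy (two per diploid individual).
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * n

    na = len(counts)  # number of distinct alleles (Na)
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1.0 - sum((c / total_copies) ** 2 for c in counts.values())
    # FIS is undefined for a monomorphic locus (He == 0).
    fis = 1.0 - ho / he if he > 0 else float("nan")

    return {"Na": na, "Ho": ho, "He": he, "FIS": fis}


# Hypothetical fragment sizes (bp) for four individuals at one locus.
example = [(150, 150), (150, 154), (154, 154), (150, 150)]
stats = locus_stats(example)
print(stats)
```

Under these assumptions, a positive FIS means fewer heterozygotes were observed than expected from the allele frequencies, which is the pattern the abstract reports for every locus except Lg41.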