EST-derived simple sequence repeat markers (EST-SSRs) are important tools for studies of genetic diversity, phylogeny, evolution, comparative genomics, QTL analysis, and gene-based associations. We searched the literature for known EST-SSRs used for Scots pine (Pinus sylvestris L.), one of the world’s major forest species. Of the 102 EST-SSRs suggested for Scots pine studies, 91 were then manually aligned against the reference genome of Pinus taeda L. and against available genes of P. sylvestris. For 83 EST-SSRs, the genome location and consensus putative functions of the associated genes were identified through conserved domain analysis (CDD), functional analysis of known homologs in terms of Gene Ontology annotations, and KEGG pathway analysis. Many of the markers were located in untranslated regions (mostly in the 3’-UTR), as well as in coding sequences of Scots pine and loblolly pine genes. For eight markers whose EST sequences were known, no genes could be identified in either species. Seven of these markers were located in P. taeda scaffold regions carrying no genes in the current genome assembly (v.1.0). The results can be used to improve the choice of markers for population genetic research, studies of adaptive traits, and QTL mapping of P. sylvestris, as well as other pine species.