Abstract
A wide diversity of bioinformatic tools is available for the assembly of next-generation sequence data and subsequent variant calling to identify genetic markers at scale. Integration of genomics tools, such as genomic selection, association studies, pedigree analysis and analysis of genetic diversity, into operational breeding is a goal for New Zealand’s most widely planted exotic tree species, Pinus radiata. In the absence of full reference genomes for megagenomes such as those of conifers, RNA sequencing across a range of genotypes and tissue types offers a rich source of genetic markers for downstream application. We compared nine different assembler and variant calling software combinations in a single transcriptomic library and found that Single Nucleotide Polymorphism (SNP) discovery could vary by as much as an order of magnitude (from 8,061 SNPs up to 86,815 SNPs). Trinity, the assembler with the best realignment among the packages trialled, was then applied in combination with several variant callers to a much larger multi-genotype, multi-tissue transcriptome, identifying 683,135 in silico SNPs across a predicted 449,951 exons when mapped to the Pinus taeda ver 1.01e reference.
Highlights
Radiata pine (Pinus radiata D.Don) is New Zealand’s most widely planted exotic forestry species [1] and breeding programmes are moving towards the implementation of genomics technologies to deliver genetic gains through selective breeding for traits of importance
To maximise the number of Single Nucleotide Polymorphisms (SNPs) detected, we investigated transcriptomes from a range of tissues and genotypes
For Trees 2, 3, and 4, bud samples were harvested from growing vegetative meristems, preferentially during the early spring flush (Fig 1C); Tree 6 buds were collected in autumn
Summary
Radiata pine (Pinus radiata D.Don) is New Zealand’s most widely planted exotic forestry species [1], and breeding programmes are moving towards the implementation of genomics technologies to deliver genetic gains through selective breeding for traits of importance. Prior to the advent of Next Generation Sequencing platforms, Expressed Sequence Tag (EST) libraries, based on captured and sequenced cDNA, were a mainstay of gene discovery and functional genomics [2, 3]. EST libraries have long been a rich resource for markers such as microsatellites or Simple Sequence Repeats (SSRs) [4, 5]. The conserved nature of gene sequence across conifers has meant that EST-based markers from one species can frequently be studied in related species, giving insight into evolutionary processes and increasing the pool of available markers across a genus [6, 7, 8]. Next Generation Sequencing (NGS) is changing the face of molecular biology and marker discovery [9, 10, 11]. At its inception in 2006, the Illumina platform generated average read lengths of 35 bases and 1