Abstract

Intact transposable elements (TEs) account for 65% of the maize genome and can impact gene function and regulation. Although TEs affect important phenotypes, genome-wide patterns of TE polymorphism have been studied in only a handful of maize genotypes because highly repetitive sequences are difficult to assess. We implemented a method that uses short-read sequencing data from 509 diverse inbred lines to classify the presence/absence of 445,418 nonredundant TEs previously annotated in four genome assemblies (B73, Mo17, PH207, and W22). Different orders of TEs (i.e., LTRs, Helitrons, and TIRs) had different frequency distributions within the population. LTRs with lower LTR similarity were generally more frequent in the population than LTRs with higher LTR similarity, though high-frequency insertions with very high LTR similarity were also observed. LTR similarity and frequency estimates for nested elements and the outer elements into which they inserted revealed that most nesting events occurred very close in time to the insertion of the outer element. TEs within genes were at higher frequency than those outside of genes, and this was particularly true for those not inserted into introns. Many TE insertional polymorphisms observed in this population were tagged by SNP markers. However, 19.9% of the TE polymorphisms were not well tagged by SNPs (R² < 0.5) and potentially represent information that has not been well captured in previous SNP-based marker-trait association studies. This study provides a population-scale, genome-wide assessment of TE variation in maize and valuable insight into the factors that contribute to this variation.

Highlights

  • Transposable elements (TEs) are present in all eukaryotic genomes (Bennetzen 2000; Wicker et al 2007)

  • We implemented an approach for scoring transposable element (TE) presence/absence variation from short-read sequence alignments, using the average coverage of windows within the boundaries of previously annotated TEs as input to a random forest machine learning model

  • Presence/absence scores defined by a previous comparison of TE content across four maize genome assemblies (B73, Mo17, PH207, and W22) were used as true positives (Anderson et al 2019)
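
The window-coverage idea in the highlights above can be illustrated with a minimal sketch. It shows only the feature step: averaging read depth over equal-width windows inside one annotated TE. The function name, the per-base depth list, the 0-based half-open coordinates, and the number of windows are all assumptions made for illustration; the source does not give the authors' window sizes or normalization.

```python
def window_coverage_features(depth, te_start, te_end, n_windows=5):
    """Mean read depth in n_windows near-equal windows spanning [te_start, te_end).

    depth     -- hypothetical list of per-base read depths (0-based positions)
    te_start  -- first base of the annotated TE (inclusive)
    te_end    -- end of the annotated TE (exclusive)
    n_windows -- illustrative choice, not taken from the source
    """
    length = te_end - te_start
    if length < n_windows:
        raise ValueError("TE is shorter than the requested number of windows")
    # Integer window edges; the first is te_start and the last is te_end.
    edges = [te_start + (i * length) // n_windows for i in range(n_windows + 1)]
    # One mean depth per window; every window holds at least one base.
    return [sum(depth[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]


# Toy depth tracks around a TE annotated at positions 100-150.
present = [30] * 250                         # reads cover the whole element
absent = [30] * 100 + [0] * 50 + [30] * 100  # no reads inside the element
partial = [30] * 125 + [0] * 125             # coverage drops midway through

print(window_coverage_features(present, 100, 150))  # [30.0, 30.0, 30.0, 30.0, 30.0]
print(window_coverage_features(absent, 100, 150))   # [0.0, 0.0, 0.0, 0.0, 0.0]
print(window_coverage_features(partial, 100, 150))  # [30.0, 30.0, 15.0, 0.0, 0.0]
```

Feature vectors of this kind, paired with presence/absence labels from the four assemblies, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`. That pairing is a suggestion for how the pieces fit together, not a description of the authors' pipeline.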

Introduction

Transposable elements (TEs) are present in all eukaryotic genomes (Bennetzen 2000; Wicker et al 2007). Long terminal repeat (LTR) retrotransposons are the most abundant type of retrotransposon in maize (Bennetzen 2000), and intact LTR elements account for over half of the maize genome by sequence length (Anderson et al 2019; Stitzer et al 2019). Terminal inverted repeat (TIR) transposons are defined by TIR sequences at both ends of the TE (Wicker et al 2007), and intact TIRs make up around 3% of the maize genome (Anderson et al 2019). Helitrons are defined by their “rolling circle” replication mechanism (Lisch 2013), and intact Helitrons make up around 4% of the maize genome (Anderson et al 2019).

