Assembly of complete genomes can reveal functional genetic elements missing from draft sequences. Here we present the near-complete telomere-to-telomere and contiguous genome of the cotton species Gossypium raimondii. Our assembly identified gaps and misoriented or misassembled regions in previous assemblies and produced 13 centromeres, with 25 chromosomal ends having telomeres. In contrast to satellite-rich Arabidopsis and rice centromeres, cotton centromeres lack phased CENH3 nucleosome positioning patterns and probably evolved by invasion from long terminal repeat retrotransposons. In-depth expression profiling of transposable elements revealed a previously unannotated DNA transposon (MuTC01) that interacts with miR2947 to produce trans-acting small interfering RNAs (siRNAs), one of which targets the newly evolved LEC2 (LEC2b) to produce phased siRNAs. Systematic genome editing experiments revealed that this tripartite module, miR2947-MuTC01-LEC2b, controls the morphogenesis of complex folded embryos characteristic of Gossypium and its close relatives in the cotton tribe. Our study reveals a trans-acting siRNA-based tripartite regulatory pathway for embryo development in higher plants.