The Japanese cedar (Cryptomeria japonica D. Don) is one of the most important forest trees in Japan, occupying approximately 44% of artificial forests, and is also planted in East Asia, the Azores Archipelago, and certain islands in the Indian Ocean. Although its large genome (ca. 9 Gbp), rich in repeat elements, may have been an obstacle to genetic analysis, the species is easily propagated by cuttings, induced to flower with gibberellic acid, transformed by Agrobacterium, and edited with CRISPR/Cas9. These characteristics recommend C. japonica as a model conifer species, for which a reference genome sequence is necessary. Herein, we report the first chromosome-level assembly of C. japonica (2n = 22) using third-generation selfed progeny (estimated homozygosity rate = 0.96). Young leaf tissue was used to extract high-molecular-weight DNA (> 50 kb) for PacBio HiFi long-read sequencing and to construct a Hi-C/Omni-C library for Illumina short-read sequencing. De novo assembly with 29× and 26× genome coverage of HiFi and Illumina reads, respectively, yielded 2,651 contigs (9.1 Gbp, contig N50 of 12.0 Mbp). Hi-C analysis anchored 97% of the nucleotides to 11 chromosomes. The assembly was verified by comparison with a consensus linkage map comprising 7,781 markers. BUSCO analysis identified ~91% of conserved genes. The gene annotations and comparisons of repeat elements with other Cupressaceae and Pinaceae species provide a fundamental resource for conifer research.
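For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study's pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Illustrative helper (hypothetical name, not from the paper).
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum to >= 50% of total.
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units), made up for illustration only.
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

Applied to the real assembly, the input would be the 2,651 contig lengths, for which the abstract reports an N50 of 12.0 Mbp.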