Abstract
Background: Juglans sigillata, or iron walnut, belonging to the order Juglandales, is an economically important tree species in Asia, especially in the Yunnan province of China. However, little research has been conducted on J. sigillata at the molecular level, which hinders understanding of its evolution, speciation, and synthesis of secondary metabolites, as well as its wide adaptability to its plateau environment. To address these issues, a high-quality reference genome of J. sigillata would be useful.

Findings: To construct a high-quality reference genome for J. sigillata, we first generated 38.0 Gb short reads and 66.31 Gb long reads using Illumina and Nanopore sequencing platforms, respectively. The sequencing data were assembled into a 536.50-Mb genome assembly with a contig N50 length of 4.31 Mb. Additionally, we applied BioNano technology to identify contacts among contigs, which were then used to assemble contigs into scaffolds, resulting in a genome assembly with scaffold N50 length of 16.43 Mb and contig N50 length of 4.34 Mb. To obtain a chromosome-level genome assembly, we constructed 1 Hi-C library and sequenced 79.97 Gb raw reads using the Illumina HiSeq platform. We anchored ∼93% of the scaffold sequences into 16 chromosomes and evaluated the quality of our assembly using the high contact frequency heat map. Repetitive elements account for 50.06% of the genome, and 30,387 protein-coding genes were predicted from the genome, of which 99.8% have been functionally annotated. The genome-wide phylogenetic tree indicated an estimated divergence time between J. sigillata and Juglans regia of 49 million years ago on the basis of single-copy orthologous genes.

Conclusions: We provide the first chromosome-level genome for J. sigillata. It will lay a valuable foundation for future research on the genetic improvement of J. sigillata.
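The contig and scaffold N50 values quoted in the abstract follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation is given below. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the J. sigillata assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, arbitrary units), not real assembly data.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

Applied to a real assembly, the input would be the list of contig lengths (for contig N50) or scaffold lengths (for scaffold N50) read from the assembly FASTA file.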
Highlights
Juglans sigillata, or iron walnut, belonging to the order Juglandales, is an economically important tree species in Asia, especially in the Yunnan province of China.
The combined results of the homology-based and de novo predictions indicated that repeated sequences account for 50.06% of the J. sigillata genome assembly, with long terminal repeats accounting for the greatest proportion (21.42%) (Supplementary Table S10 and Fig. 1).
The detected J. sigillata genes were clustered in families using the OrthoMCL (v2.0.9) pipeline (OrthoMCL DB: Ortholog Groups of Protein Sequences, RRID:SCR_007839) [48], with an E-value cutoff of 1e−5, and Markov chain clustering with a default inflation parameter in an all-to-all BLASTP analysis of entries for 13 species (A. thaliana, B. pendula, Castanea mollissima, Cocos nucifera, E. guineensis, Jatropha curcas, J. regia, O. europaea, P. trichocarpa, Ricinus communis, Sesamum indicum, Solanum lycopersicum, Vitis vinifera).
Summary
Walnut is an important nut crop with high nutritive value and is grown in temperate climates. Reads from the Illumina DNA library (400 bp) were aligned against the genome assembly using BWA (BWA, RRID:SCR_010910), and the genome was polished once again using Pilon 1.22 with default parameters, yielding a final draft genome of ∼574.62 Mb with only 164 gaps, a total gap length equal to 5.65% of the genome, and contig and scaffold N50 sizes of 4.34 and 16.43 Mb, respectively (Supplementary Table S7). The detected J. sigillata genes were clustered in families using the OrthoMCL (v2.0.9) pipeline (OrthoMCL DB: Ortholog Groups of Protein Sequences, RRID:SCR_007839) [48], with an E-value cutoff of 1e−5, and Markov chain clustering with a default inflation parameter in an all-to-all BLASTP analysis of entries for 13 species (A. thaliana, B. pendula, Castanea mollissima, Cocos nucifera, E. guineensis, Jatropha curcas, J. regia, O. europaea, P. trichocarpa, Ricinus communis, Sesamum indicum, Solanum lycopersicum, Vitis vinifera). The significantly enriched KEGG pathways included “plant-pathogen interactions” (65 [12.29%]), “mRNA surveillance pathway” (44 [8.31%]), “Phospholipase D signaling pathway” (31 [5.86%]), “Fc gamma R-mediated phagocytosis” (31 [5.86%]), and “cAMP signaling pathway” (31 [5.86%]) (Additional File 3 and Supplementary Fig. S5).
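The gene family clustering described above starts from an all-to-all BLASTP comparison filtered at an E-value cutoff of 1e−5. The sketch below illustrates only that filtering step on BLAST tabular output, and it is not the OrthoMCL implementation. It assumes the standard 12-column tabular layout (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), treats hits at or below the cutoff as retained, and uses made-up gene identifiers; the function name `filter_blast_hits` is hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep non-self BLAST hits whose E-value is at or below max_evalue.

    Each line is assumed to be one row of 12-column tab-separated
    BLAST tabular output, with the E-value in the 11th column.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if query != subject and evalue <= max_evalue:
            hits.append((query, subject, evalue))
    return hits


# Hypothetical rows with invented gene identifiers, for illustration only.
toy_rows = [
    "Jsi_g1\tJre_g7\t92.1\t310\t24\t1\t1\t310\t5\t314\t3e-180\t512",
    "Jsi_g1\tJsi_g1\t100.0\t310\t0\t0\t1\t310\t1\t310\t0.0\t640",
    "Jsi_g2\tVvi_g3\t31.5\t89\t55\t3\t10\t98\t40\t125\t0.002\t38.1",
    "Jsi_g3\tPtr_g9\t58.4\t202\t80\t2\t4\t205\t1\t200\t1e-5\t250",
]
kept_hits = filter_blast_hits(toy_rows)
```

In this toy input the self-hit and the weak hit (E-value 0.002) are dropped. In a real workflow the retained pairs would then be passed to Markov chain clustering to form gene families.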