Whole-genome sequencing and analysis of the Chinese herbal plant Gelsemium elegans

Yisong Liu, Qi Tang, Pi Cheng, Mingfei Zhu, Hui Zhang, Jiazhe Liu, Mengting Zuo, Chongyin Huang, Changqiao Wu, Zhiliang Sun, Zhaoying Liu

doi:10.1016/j.apsb.2019.08.004

Abstract

Background: Gelsemium elegans (G. elegans) (2n = 2x = 16) is a flowering plant belonging to the Gelsemiaceae family.

Method: Here, a high-quality genome assembly was generated using the Oxford Nanopore Technologies (ONT) platform and high-throughput chromosome conformation capture (Hi-C).

Results: A total of 56.11 Gb of raw ONT reads (6.23 Gb per cell) were generated on the GridION X5 platform. After filtering, 53.45 Gb of clean reads were obtained, giving 160× coverage depth. A de novo genome assembly of 335.13 Mb, close to the 338 Mb estimated by k-mer analysis, was generated with a contig N50 of 10.23 Mb. The vast majority (99.2%) of the assembled G. elegans sequence was anchored onto 8 pseudo-chromosomes. Genome completeness was then evaluated, and 1338 of the 1440 conserved genes (92.9%) were found in the assembly. Genome annotation revealed that 43.16% of the G. elegans genome is composed of repetitive elements and 23.9% is composed of long terminal repeat elements. We predicted 26,768 protein-coding genes, of which 84.56% were functionally annotated.

Conclusion: The genomic sequences of G. elegans could be a valuable source for comparative genomic analysis in the Gelsemiaceae family and will be useful for understanding the phylogenetic relationships of indole alkaloid metabolism.
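
The summary statistics quoted in the abstract can be reproduced with simple arithmetic. The following Python sketch is illustrative only and is not the authors' pipeline: the function names (coverage_depth, n50, percent) are hypothetical, coverage is assumed to be clean bases divided by assembly size, and the contig lengths passed to n50 are toy values, not data from the paper.

```python
def coverage_depth(clean_bases_gb: float, genome_size_mb: float) -> float:
    """Mean coverage depth: sequenced bases divided by genome size (assumed definition)."""
    return clean_bases_gb * 1000.0 / genome_size_mb


def n50(lengths: list[int]) -> int:
    """Length of the contig at which the running sum, taken from the
    longest contig downward, first reaches half of the total length."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


# Figures taken from the abstract.
depth = coverage_depth(53.45, 335.13)   # clean reads (Gb) vs. assembly size (Mb)
completeness = percent(1338, 1440)      # conserved genes found vs. searched

# Toy contig lengths (hypothetical) to show how a contig N50 is derived.
toy_n50 = n50([2, 3, 4, 5, 6])

print(f"coverage depth: {depth:.1f}x")
print(f"completeness: {completeness:.1f}%")
print(f"toy N50: {toy_n50}")
```

Under these assumptions the depth comes out at roughly 159.5×, consistent with the rounded 160× in the abstract, and the completeness at 92.9%.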

Full Text