Abstract

Melaleuca alternifolia is a commercially important medicinal tea tree native to Australia. Tea tree oil, the essential oil distilled from its branches and leaves, has broad-spectrum germicidal activity and is highly valued in the pharmaceutical and cosmetic industries. Studying its genome, which can provide a reference for investigating the genes involved in terpinen-4-ol biosynthesis, is therefore crucial for improving the productivity of tea tree oil. In this study, next-generation sequencing was used to investigate the whole genome of Melaleuca alternifolia. About 114 Gb of high-quality sequence data were obtained and assembled into 1,838,159 scaffolds with an N50 length of 1,021 bp. The assembled genome size is about 595 Mb, roughly twice the size predicted by flow cytometry (300 Mb) and k-mer analysis (345 Mb). Benchmarking Universal Single-Copy Orthologs (BUSCO) analyses indicated that only 11.3% of the conserved single-copy genes were missing. Repetitive regions cover 40.43% of the genome. A total of 44,369 protein-coding genes were predicted and annotated against the Nr, Swissprot, Refseq, COG, KOG, and KEGG databases. Among these genes, 32,909 and 16,241 were functionally annotated in Nr and KEGG, respectively, and 29,411 and 14,435 were functionally annotated in COG and KOG, respectively. Additionally, 457,661 simple sequence repeats and 1,109 transcription factors (TFs) from 67 TF families were identified in the assembled genome. Our findings provide a draft genome sequence of M. alternifolia that can act as a reference for deep sequencing strategies and is useful for future functional and comparative genomics analyses.
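For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed from a list of scaffold lengths. It is not the authors' pipeline: the function name `n50` and the toy lengths are assumptions made purely for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (in bp), not data from the study.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

A low N50 relative to the total assembly size, as in the figures reported here, indicates that the assembly is split into many short scaffolds.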
