Abstract

Viola is a large genus with a worldwide distribution and many traits not currently exemplified in model plants, including unique breeding systems and the production of cyclotides. Here we report the de novo genome assembly and transcriptomic analyses of the non-model species Viola pubescens using short-read DNA sequencing data and RNA-Seq data from eight diverse tissues. First, the V. pubescens genome size was estimated through flow cytometry, yielding an approximate haploid genome size of 455 Mbp. Next, the V. pubescens genome was sequenced, producing 264,035,065 read pairs, and assembled into a draft of 161,038 contigs with an N50 length of 3,455 base pairs (bp). RNA-Seq data were then assembled into tissue-specific transcripts. Together, the DNA and transcript data generated 38,081 ab initio gene models, which were functionally annotated based on homology to Arabidopsis thaliana genes and Pfam domains. Gene expression was visualized for each tissue via principal component analysis and hierarchical clustering, and gene co-expression analysis identified 20 modules of tissue-specific transcriptional networks. Some of these modules highlight genetic differences between chasmogamous and cleistogamous flowers and may provide insight into the mixed breeding system of V. pubescens. Orthologous clustering with the proteomes of A. thaliana and Populus trichocarpa revealed 8,531 sequences unique to V. pubescens, including 81 novel cyclotide precursor sequences. Cyclotides are plant peptides characterized by a stable, cyclic cystine knot motif, making them strong candidates for drug scaffolding and protein engineering. Analysis of the RNA-Seq data for these cyclotide transcripts revealed diverse expression patterns across both transcripts and tissues. The diversity of these cyclotides was also highlighted in a maximum likelihood protein cladogram containing V. pubescens cyclotides and published cyclotide sequences from other Violaceae and Rubiaceae species. Collectively, this work provides the most comprehensive sequence resource for Viola, offers valuable transcriptomic insight into V. pubescens, and will facilitate future functional genomics research in Viola and other diverse plant groups.
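
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the V. pubescens assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, NOT the real V. pubescens contigs.
toy_contigs = [8000, 5000, 3455, 2000, 1000, 500]
print(n50(toy_contigs))  # prints 5000
```

In the toy example the two longest contigs (8,000 + 5,000 bp) already cover more than half of the 19,955 bp total, so the N50 is 5,000 bp. An N50 of 3,455 bp for the draft assembly likewise means that half of the assembled bases lie in contigs at least that long.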

Highlights

  • The genus Viola is distributed in both the northern and southern temperate regions as well as the tropics and possesses high diversity with 580–620 species, extensive allopolyploidy, and a distinct cytogenetic evolutionary history (Ballard et al, 1998; Marcussen et al, 2012, 2015; Wahlert et al, 2014)

  • Of the 248 core eukaryotic genes in the Core Eukaryotic Genes Mapping Approach (CEGMA) set, 233 (94%) had partial matches and 188 (76%) had complete matches in the V. pubescens genome (Table 1)

  • We describe the de novo assembly and annotation of the V. pubescens genome from 26.6 Gbp of short-read DNA-Seq
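
The figures in these highlights can be related to the flow-cytometry genome size estimate with some simple arithmetic. The sketch below assumes the 26.6 Gbp of reads and the 455 Mbp haploid estimate are directly comparable, and computes the implied average sequencing depth along with the CEGMA match percentages. The variable and function names are illustrative, and this is not the authors' own calculation.

```python
# Values quoted in the text; the names are illustrative assumptions.
SEQUENCED_BP = 26_600_000_000   # 26.6 Gbp of short-read DNA-Seq
GENOME_SIZE_BP = 455_000_000    # ~455 Mbp haploid estimate (flow cytometry)
CEGMA_TOTAL = 248               # core eukaryotic genes in the CEGMA set
CEGMA_PARTIAL = 233
CEGMA_COMPLETE = 188


def sequencing_depth(total_bases, genome_size):
    """Average number of times each genome position is covered by reads."""
    return total_bases / genome_size


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


depth = sequencing_depth(SEQUENCED_BP, GENOME_SIZE_BP)
partial_pct = percent(CEGMA_PARTIAL, CEGMA_TOTAL)
complete_pct = percent(CEGMA_COMPLETE, CEGMA_TOTAL)

print(f"approximate depth: {depth:.1f}x")
print(f"CEGMA partial: {partial_pct:.0f}%, complete: {complete_pct:.0f}%")
```

Under these assumptions the raw read data correspond to roughly 58x average depth, and the computed CEGMA percentages round to the 94% and 76% reported above.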

Introduction

The genus Viola (violets) is distributed in both the northern and southern temperate regions as well as the tropics and possesses high diversity with 580–620 species, extensive allopolyploidy, and a distinct cytogenetic evolutionary history (Ballard et al, 1998; Marcussen et al, 2012, 2015; Wahlert et al, 2014). Most Viola species, including V. pubescens, possess an evolutionarily successful yet genetically uncharacterized mixed breeding system of both chasmogamous and cleistogamous flowers. Culley and Klooster (2007) surveyed the occurrence of the chasmogamous/cleistogamous mixed breeding system, reporting a total of 536 species across 41 diverse plant families, with the most occurrences in Poaceae (grasses), Fabaceae (legumes), Violaceae (violets), and Orchidaceae (orchids). The widespread distribution of the chasmogamous/cleistogamous mixed breeding system among monocot and dicot families, as well as its expansive geographic range, suggests that the breeding system has evolved many times within the angiosperms (Ballard et al, 2011). This broad distribution implies that the mixed breeding system is not a randomly occurring mating strategy and may be actively selected
