Abstract
Sequencing and epigenetic profiling of target genes in plants are important tasks with applications ranging from marker design for plant breeding to the study of gene expression regulation. They are of particular interest for plants with large genomes, for which whole-genome sequencing can be time-consuming and costly. In this study, we asked whether the recently proposed Cas9-targeted nanopore sequencing (nCATS) approach is efficient for sequencing target genes in plant species with large genomes. We applied nCATS to sequence the full-length glutenin genes (Glu-1Ax, Glu-1Bx and Glu-1By) and their promoters in hexaploid triticale (× Triticosecale, AABBRR, genome size 24 Gb). We showed that while the target gene enrichment per se was quite high for the three glutenin genes (up to 645×), the sequencing depth achieved from two MinION flow cells was relatively low (5–17×). However, this sequencing depth was sufficient for various tasks, including detection of InDels and single-nucleotide polymorphisms (SNPs), read phasing and methylation profiling. Using nCATS, we uncovered SNP and InDel variation in the full-length glutenin genes, providing useful information for marker design and for deciphering the variation of individual Glu-1By alleles. Moreover, we demonstrated that glutenin genes possess a ‘gene-body’ methylation profile, with a hypermethylated coding sequence (CDS) and a hypomethylated promoter region. This finding raises the question of what role gene-body methylation plays in the regulation of glutenin gene expression. Taken together, our work demonstrates the potential of the nCATS approach for sequencing target genes in plants with large genomes.
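The contrast between high enrichment (up to 645×) and modest depth (5–17×) is easier to see with a small calculation. The sketch below assumes that fold enrichment is the mean depth over the target region divided by the mean depth over the rest of the genome. This definition is an assumption, since the abstract does not state how enrichment was computed. The function names and all read counts are hypothetical. Only the 24 Gb genome size comes from the text.

```python
def mean_depth(aligned_bases: int, region_length: int) -> float:
    """Mean sequencing depth: aligned bases divided by region length (bp)."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return aligned_bases / region_length


def fold_enrichment(on_target_bases: int, target_length: int,
                    total_bases: int, genome_length: int) -> float:
    """Ratio of on-target depth to background (off-target) depth.

    Assumed definition, not necessarily the one used in the study.
    """
    target_depth = mean_depth(on_target_bases, target_length)
    background_depth = mean_depth(total_bases - on_target_bases,
                                  genome_length - target_length)
    if background_depth == 0:
        raise ValueError("no off-target bases; enrichment is undefined")
    return target_depth / background_depth


# Hypothetical run: a 5 kb target in a 24 Gb genome (illustrative numbers only).
GENOME_LENGTH = 24_000_000_000   # 24 Gb, as stated in the abstract
TARGET_LENGTH = 5_000            # hypothetical target size in bp
ON_TARGET_BASES = 50_000         # hypothetical bases aligned to the target
TOTAL_BASES = 480_050_000        # hypothetical total aligned bases

depth = mean_depth(ON_TARGET_BASES, TARGET_LENGTH)
enrichment = fold_enrichment(ON_TARGET_BASES, TARGET_LENGTH,
                             TOTAL_BASES, GENOME_LENGTH)
print(f"target depth: {depth:.1f}x, fold enrichment: {enrichment:.0f}x")
```

With a genome this large, the background depth from a few flow cells is a small fraction of 1×, so even a several-hundred-fold enrichment translates into only tens of reads over the target.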
Highlights
Target gene sequencing (TGSeq) is a set of approaches for sequencing specific genes without resorting to whole-genome sequencing (WGS), which is an expensive alternative
It may provide a foundation for the study of epigenetic control of spatiotemporal gene expression patterns, a poorly studied field especially in plants with big and complex genomes such as wheat and triticale
Using triticale and glutenin genes as targets, we demonstrated that nCATS is a useful method, although low sequencing depth should be expected and more flow cells may be required
Summary
Target gene sequencing (TGSeq) is a set of approaches for sequencing specific genes without resorting to whole-genome sequencing (WGS), which is an expensive alternative. (Allo)polyploidy adds another layer of complexity to sequencing individual genes and interpreting the results. Several approaches have been used for TGSeq, including Sanger sequencing, target gene enrichment followed by short-read sequencing, and long-read sequencing. Sanger sequencing is the method of choice for end-to-end sequencing of short genes (below 1 kb), while longer genes require amplification of a set of overlapping fragments. Several short-read-based techniques have been developed and successfully used for TGSeq (reviewed by [1]). However, short-read sequencing suffers from mapping problems in repetitive and low-complexity regions, as well as from assembly errors if de novo gene assembly is used.