Fully sequencing the cassava full-length cDNA library reveals unannotated transcript structures and alternative splicing events in regions with a high density of single nucleotide variations, insertions-deletions, and heterozygous sequences.

Akihiro Ezoe,Anh Thu Vu,Maho Tanaka,Tetsuya Sakurai,Hiroki Tokunaga,Yoshinori Utsumi,Satoshi Iuchi,Satoshi Takahashi,Yukie Aso,Junko Ishida,Motoaki Seki,Manabu Ishitani

doi:10.1007/s11103-023-01346-4

Akihiro Ezoe, Anh Thu Vu + Show 10 more

https://doi.org/10.1007/s11103-023-01346-4

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

The primary transcript structure provides critical insights into protein diversity, transcriptional modification, and functions. Cassava transcript structures are highly diverse because of alternative splicing (AS) events and high heterozygosity. To precisely determine and characterize transcript structures, fully sequencing cloned transcripts is the most reliable method. However, cassava annotations were mainly determined according to fragmentation-based sequencing analyses (e.g., EST and short-read RNA-seq). In this study, we sequenced the cassava full-length cDNA library, which included rare transcripts. We obtained 8,628 non-redundant fully sequenced transcripts and detected 615 unannotated AS events and 421 unannotated loci. The different protein sequences resulting from the unannotated AS events tended to have diverse functional domains, implying that unannotated AS contributes to the truncation of functional domains. The unannotated loci tended to be derived from orphan genes, implying that the loci may be associated with cassava-specific traits. Unexpectedly, individual cassava transcripts were more likely to have multiple AS events than Arabidopsis transcripts, suggestive of the regulated interactions between cassava splicing-related complexes. We also observed that the unannotated loci and/or AS events were commonly in regions with abundant single nucleotide variations, insertions-deletions, and heterozygous sequences. These findings reflect the utility of completely sequenced FLcDNA clones for overcoming cassava-specific annotation-related problems to elucidate transcript structures. Our work provides researchers with transcript structural details that are useful for annotating highly diverse and unique transcripts and alternative splicing events.

Full Text