Abstract

Circular RNAs (circRNAs), covalently closed continuous RNA loops, are generated from cognate linear RNAs through back splicing events, and alternative splicing events may generate different circRNA isoforms at the same locus. However, the challenges of reconstruction and quantification of alternatively spliced full-length circRNAs remain unresolved. On the basis of the internal structural characteristics of circRNAs, we developed CircAST, a tool to assemble alternatively spliced circRNA transcripts and estimate their expression by using multiple splice graphs. Simulation studies showed that CircAST correctly assembled the full sequences of circRNAs with a sensitivity of 85.63%–94.32% and a precision of 81.96%–87.55%. By assigning reads to specific isoforms, CircAST quantified the expression of circRNA isoforms with correlation coefficients of 0.85–0.99 between theoretical and estimated values. We evaluated CircAST on an in-house mouse testis RNA-seq dataset with RNase R treatment for enriching circRNAs and identified 380 circRNAs with full-length sequences different from those of their corresponding cognate linear RNAs. RT-PCR and Sanger sequencing analyses validated 32 out of 37 randomly selected isoforms, thus further indicating the good performance of CircAST, especially for isoforms with low abundance. We also applied CircAST to published experimental data and observed substantial diversity in circular transcripts across samples, thus suggesting that circRNA expression is highly regulated. CircAST can be accessed freely at https://github.com/xiaofengsong/CircAST.
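The sensitivity and precision figures quoted above can be made concrete with a minimal sketch. It assumes, for illustration only, that an assembled isoform counts as correct when its chromosome, strand, and full exon chain exactly match a simulated isoform; the function name `assembly_metrics` and the toy coordinates are hypothetical and are not taken from CircAST.

```python
def assembly_metrics(true_isoforms, assembled_isoforms):
    """Return (sensitivity, precision) for two collections of isoforms.

    Each isoform is a hashable tuple: (chrom, strand, exon_chain), where
    exon_chain is a tuple of (start, end) pairs. Exact-match scoring is an
    assumption made for this sketch.
    """
    truth = set(true_isoforms)
    called = set(assembled_isoforms)
    correct = truth & called  # isoforms reconstructed exactly
    sensitivity = len(correct) / len(truth) if truth else 0.0
    precision = len(correct) / len(called) if called else 0.0
    return sensitivity, precision


# Toy data (hypothetical coordinates): four simulated isoforms, three assembled.
truth = [
    ("chr1", "+", ((100, 200), (300, 400), (500, 600))),
    ("chr1", "+", ((100, 200), (500, 600))),
    ("chr2", "-", ((50, 150), (250, 350))),
    ("chr3", "+", ((10, 90),)),
]
assembled = [
    ("chr1", "+", ((100, 200), (300, 400), (500, 600))),
    ("chr1", "+", ((100, 200), (500, 600))),
    ("chr2", "-", ((50, 150), (200, 350))),  # wrong internal exon boundary
]

sens, prec = assembly_metrics(truth, assembled)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Here two of the four simulated isoforms are recovered and two of the three assembled isoforms are correct, so sensitivity measures completeness while precision measures how many reported isoforms can be trusted.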

Highlights

  • Circular RNAs are covalently closed continuous RNA loops generated through back splicing events [1,2,3,4]

  • Spliced reads, together with the exon boundaries provided in the gene annotation file, are used to construct splice graphs for all back splicing events detected in a gene locus by upstream circRNA identification tools such as UROBORUS, CIRCexplorer2, or CIRI2

  • Previous studies have suggested that RNase R can remove linear RNAs and enrich both circular RNAs and intron lariats [33]
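The splice-graph idea in the highlights can be sketched in a simplified form. The example below is an illustration under stated assumptions, not CircAST's actual algorithm: exons are represented by hypothetical integer indices in transcription order, forward-splice junctions are given as (donor exon, acceptor exon) pairs, and every path between the two exons joined by a back-splice junction is treated as a candidate full-length isoform.

```python
from collections import defaultdict


def build_splice_graph(junctions):
    """Build a directed graph: exon -> set of exons reached by a spliced read."""
    graph = defaultdict(set)
    for donor_exon, acceptor_exon in junctions:
        graph[donor_exon].add(acceptor_exon)
    return graph


def enumerate_isoforms(graph, first_exon, last_exon):
    """List every exon path from first_exon to last_exon.

    first_exon and last_exon are the two exons joined by the back-splice
    junction, so each path corresponds to one candidate circular isoform.
    """
    paths = []
    stack = [(first_exon, [first_exon])]
    while stack:
        exon, path = stack.pop()
        if exon == last_exon:
            paths.append(path)
            continue
        for nxt in sorted(graph.get(exon, ()), reverse=True):
            if nxt > exon:  # follow forward splicing only
                stack.append((nxt, path + [nxt]))
    return paths


# Hypothetical locus: back-splice joins exon 5 to exon 2; exon 3 is optional.
junctions = [(2, 3), (3, 4), (2, 4), (4, 5)]
graph = build_splice_graph(junctions)
isoforms = enumerate_isoforms(graph, first_exon=2, last_exon=5)
print(isoforms)
```

In this toy locus the graph yields two candidate isoforms, one that includes exon 3 and one that skips it. Choosing among candidate paths and estimating their abundance requires the read-assignment step described in the abstract, which this sketch does not attempt.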


Introduction

Circular RNAs (circRNAs) are covalently closed continuous RNA loops generated through back splicing events [1,2,3,4]. Functional studies depend on accurate annotation of full-length circRNA sequences, for example to identify miRNA sponge sites and protein binding sites or to assess protein-coding potential [6,7,8,9,10,11,12]. CircRNAs may have multiple isoforms with different full-length sequences produced by alternative splicing (AS) events. A recent study showed that the sequence of the circRNA at the FBXW7 locus differs from that of its cognate linear form and revealed that the 185-aa protein encoded by circFBXW7 inhibits cancer cell proliferation [9]. The lack of accurate full-length sequence information for circRNAs therefore poses significant challenges for functional studies: relying on assembled circRNA isoforms with incorrect or incomplete sequences may lead to erroneous conclusions [9].

