Abstract

RNA sequencing (RNA-seq) is a widely used high-throughput method for characterizing transcriptomic dynamics across space and time. However, RNA-seq data analysis pipelines typically depend on a sequenced genome, corresponding reference transcripts, or both. This dependence poses a challenge for species that lack these resources. The Nile rat (Arvicanthis niloticus) has two key features: it is active during the daytime, and it is prone to diet-induced diabetes. Both make it more similar to humans than regular laboratory rodents are. However, at the time of this study, neither a Nile rat genome nor a reference transcript set was available, making it technically challenging to perform large-scale RNA-seq-based transcriptomic studies. This genome-independent work progressed concurrently with our generation of a Nile rat genome, because a well-annotated genome requires several iterations of manual review of curated transcripts and takes years to achieve. Here, we developed a Comparative RNA-Seq Pipeline (CRSP), which integrates a comparative species strategy and does not depend on a specific sequenced genome or on species-matched reference transcripts. We performed benchmarking to validate that CRSP can accurately quantify gene expression levels. We also generated the first ultra-deep (2.3 billion × 2 paired-end) Nile rat RNA-seq data from 59 biopsy samples representing 22 major organs, providing a unique resource and spatial gene expression reference for Nile rat researchers. Importantly, CRSP is not limited to the Nile rat and can be applied to any species without prior genomic knowledge. To facilitate general use of CRSP, we also used simulation studies to characterize the number of RNA-seq reads required for accurate estimation of gene expression. CRSP and its documentation are available at: https://github.com/pjiang1105/CRSP.
