Abstract

Long non-coding RNAs (lncRNAs; >200 nt) are expressed at lower levels than protein-coding mRNAs, and in all eukaryotic model species where they have been characterized, they are transcribed from thousands of different genomic loci. In humans, some four dozen lncRNAs have been studied in detail and shown to play important roles in transcriptional regulation, acting in conjunction with transcription factors and epigenetic marks to modulate tissue-type-specific programs of transcriptional gene activation and repression. In Schistosoma mansoni, around 10,000 lncRNAs have been identified in previous works. However, the limited number of RNA-sequencing (RNA-seq) libraries previously assessed, together with the use of old and incomplete versions of the S. mansoni genome and protein-coding transcriptome annotations, has hampered the identification of all lncRNAs expressed in the parasite. Here we have used 633 publicly available S. mansoni RNA-seq libraries from whole worms at different stages (n = 121), from isolated tissues (n = 24), from cell populations (n = 81), and from single cells (n = 407). We have assembled a set of 16,583 lncRNA transcripts originating from 10,024 genes, of which 11,022 are novel S. mansoni lncRNA transcripts, whereas the remaining 5,561 transcripts comprise 120 lncRNAs that are identical to, and 5,441 lncRNAs that have gene overlap with, S. mansoni lncRNAs already reported in previous works. Most importantly, our more stringent assembly and filtering pipeline has identified and removed a set of 4,293 lncRNA transcripts from previous publications that were in fact derived from partially processed mRNAs with intron retention. We have used weighted gene co-expression network analyses and identified 15 different gene co-expression modules. Each parasite life-cycle stage has at least one highly correlated gene co-expression module, and each module comprises hundreds to thousands of lncRNAs and mRNAs with correlated expression patterns across stages. Inspection of the most highly connected genes within the modules’ networks has shown that different lncRNAs are hub genes at different life-cycle stages, making them among the most promising candidates for further functional characterization.
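
The notion of a hub gene used in the abstract can be illustrated with a minimal sketch. It assumes an unsigned, soft-thresholded adjacency of the form |cor|^beta with beta = 6, which is a common convention in weighted co-expression analysis and not necessarily the setting used in this work. The function name `rank_hub_genes`, the gene identifiers, and the toy expression matrix are hypothetical and serve only as illustration; this is plain NumPy, not the WGCNA package itself.

```python
import numpy as np


def rank_hub_genes(expr, gene_ids, beta=6):
    """Rank the genes of one module by intramodular connectivity.

    expr     : array of shape (n_samples, n_genes), expression values
    gene_ids : list of n_genes identifiers (hypothetical names here)
    beta     : soft-thresholding power (assumed value, not from the paper)
    """
    corr = np.corrcoef(expr, rowvar=False)   # gene-by-gene Pearson correlation
    adjacency = np.abs(corr) ** beta         # unsigned soft-thresholded adjacency
    np.fill_diagonal(adjacency, 0.0)         # ignore self-connections
    connectivity = adjacency.sum(axis=0)     # total edge weight per gene
    order = np.argsort(-connectivity)        # most connected first
    return [(gene_ids[i], float(connectivity[i])) for i in order]


# Toy data: three genes follow a shared signal, one gene is independent noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=50)
expr = np.column_stack([
    shared,
    shared + 0.2 * rng.normal(size=50),
    shared + 0.2 * rng.normal(size=50),
    rng.normal(size=50),
])
gene_ids = ["lncRNA_A", "mRNA_B", "mRNA_C", "mRNA_D"]

ranking = rank_hub_genes(expr, gene_ids)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this toy module the independent gene (`mRNA_D`) ends up with the lowest connectivity, while the genes that track the shared signal compete for the top ranks; in a real module, the top-ranked entries would be the candidate hubs.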

Highlights

  • Schistosomiasis is a neglected tropical disease caused by flatworms of the genus Schistosoma, with an estimated 250 million or more people infected worldwide and 200 thousand deaths annually in sub-Saharan Africa (WHO, 2015)

  • When the human genome was first sequenced, the vast genomic regions that lie between protein-coding genes were considered junk DNA; one decade later, the Encyclopedia of DNA Elements (ENCODE) project found that 80% of the human genome serves some biochemical purpose (Pennisi, 2012), including the transcription of nearly 10,000 long non-coding RNAs (lncRNAs) (Derrien et al, 2012)

  • Although the study of lncRNAs is still in its early stages, and the vast majority of their roles and mechanisms of action in humans remain unknown, it is clear that most lncRNAs are transcribed from intergenic regions and are key regulators of vital processes (Kitagawa et al, 2013; Rosa and Ballarino, 2016; Golicz et al, 2018), and that they are associated with several human pathologies, such as cancer (Fang and Fullwood, 2016), Alzheimer’s disease (Zijian, 2016), and cardiac diseases (Simona et al, 2018)

Introduction

Schistosomiasis is a neglected tropical disease caused by flatworms of the genus Schistosoma, with an estimated 250 million or more people infected worldwide and 200 thousand deaths annually in sub-Saharan Africa (WHO, 2015). In the Americas, it is estimated that 1 to 3 million people are infected by S. mansoni and over 25 million live in risk areas, with Brazil and Venezuela being the most affected countries (Zoni et al, 2016). The prevalence of this disease is correlated with socioeconomic and environmental factors (Gomes Casavechia et al, 2018). This parasite has a very complex life cycle comprising several developmental stages, with a freshwater snail as intermediate host and a mammal as final host (Basch, 1976). A better understanding of the mechanisms that regulate gene expression, and of their components, may lead to new therapeutic targets (Batugedara et al, 2017), and one key element could be the long non-coding RNAs (lncRNAs) (Blokhin et al, 2018).
