Abstract

Malaria, an infectious disease caused by Plasmodium parasites, has continued to cause a large number of deaths annually in recent decades. Despite the significance of Plasmodium falciparum as a model organism for malaria parasites, our understanding of gene expression in this parasite remains limited, because much of the progress on its genome and transcriptome has relied on assembly of short sequencing reads. Herein, we report a new version of the transcriptome dataset containing full-length transcripts across the whole asexual blood stages, obtained by adopting a full-length sequencing approach with optimized experimental conditions for cDNA library preparation. We identified a total of 393 alternative splicing (AS) events, 3,623 long non-coding RNAs (lncRNAs), 1,555 alternative polyadenylation (APA) events, 57 transcription factors (TFs), and 1,721 fusion transcripts in P. falciparum. Furthermore, shotgun proteomics was performed to validate the full-length transcriptome of P. falciparum. More importantly, integration of the full-length transcriptomic and proteomic data identified 160 novel small proteins in lncRNA regions. Collectively, this high-quality, accurate full-length transcriptome dataset and the shotgun proteome analyses shed light on the complex gene expression in malaria parasites and provide a valuable resource for functional and mechanistic research on P. falciparum genes.

Highlights

  • Malaria, caused by parasites of the genus Plasmodium, remains a major global public health threat, compounded by the emergence of artemisinin resistance (van der Pluijm et al, 2020)

  • Two approaches were used for cDNA library construction, and transcriptome sequencing was performed on the PacBio Sequel platform

  • Full-length RNA sequencing was used to reconstruct the transcriptome of P. falciparum

Introduction

Malaria, caused by parasites of the genus Plasmodium, remains a major global public health threat, compounded by the emergence of artemisinin resistance (van der Pluijm et al, 2020). Plasmodium parasites, especially P. falciparum, are among the deadliest pathogens causing malaria in humans, a disease transmitted by Anopheles mosquitoes. Over the last two decades, second-generation sequencing approaches have been widely used for genome and transcriptome sequencing, which has furthered our understanding of the molecular mechanisms and functions of unknown genes. However, assembling sequences from second-generation short reads is error-prone, so full-length transcripts cannot be obtained directly and gene structures, such as alternative splicing events, cannot be characterized accurately. RNA-seq has become a routine approach in gene discovery and the study of biological function. Full-length RNA-seq platforms have shown advantages in biological research, especially in gene structure identification, and are gradually taking the place of short-read RNA sequencing in transcriptome profiling.
