Abstract

Background: Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexan parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance of physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexan parasites.

Results: In this study, we used a total of 61,056 5'-end single-pass cDNA sequences from Plasmodium falciparum, P. vivax, P. yoelii, P. berghei, Cryptosporidium parvum, and Toxoplasma gondii. We compared these partially sequenced cDNAs with the currently annotated gene models and observed significant inconsistencies between the two datasets. In particular, we found that on average 14% of the exons in the current gene models were not supported by any cDNA evidence, and that 16% of the current gene models may contain at least one mis-annotation and should be re-evaluated. We also identified a large number of previously unidentified transcripts. For 732 cDNAs in T. gondii, the entire sequences were determined in order to evaluate the annotated gene models at the complete full-length transcript level. We found that 41% of the T. gondii gene models contained at least one inconsistency. We also identified, and confirmed by RT-PCR, 140 previously unidentified transcripts found in the intergenic regions of the current gene annotations. We show that the majority of these discrepancies are due to questionable predictions of one or two extra exons in the upstream or downstream regions of the genes.

Conclusion: Our data indicate that the current gene models are likely to still be incomplete and have much room for improvement. Our unique full-length cDNA information is especially useful for further refinement of the annotations for the genomes of apicomplexan parasites.
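The exon-level comparison described in the Results can be illustrated with a minimal sketch. It assumes that exons and aligned cDNA blocks are given as half-open coordinate intervals on the same contig and strand, and that an exon counts as "supported" if it overlaps any cDNA block. Both the data layout and the overlap criterion are assumptions made for illustration; the abstract does not state the criteria actually used, and the function names are hypothetical.

```python
from typing import List, Tuple

# Half-open interval [start, end) on a single contig and strand (assumed layout).
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def unsupported_exons(exons: List[Interval], cdna_blocks: List[Interval]) -> List[Interval]:
    """Return annotated exons that overlap no aligned cDNA block.

    'Supported' here means any overlap at all, which is an illustrative
    criterion rather than the one used in the study.
    """
    return [e for e in exons if not any(overlaps(e, b) for b in cdna_blocks)]


# Toy example with made-up coordinates (not data from the study).
toy_exons = [(100, 200), (300, 400), (500, 600)]
toy_cdna_blocks = [(150, 180), (390, 450)]
toy_unsupported = unsupported_exons(toy_exons, toy_cdna_blocks)
```

In this toy case only the third exon lacks cDNA evidence. A real comparison would also need to handle strand, multiple contigs, and partial 5'-end coverage.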

Highlights

  • Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing

  • We found that some complementary DNA (cDNA) clusters (16 cases in Plasmodium falciparum (Pf)) corresponded to more than one annotated gene, with the two adjacent annotated genes likely representing erroneously separated genes that should be reannotated as one gene (Figure 1b)

  • We observed that in Plasmodium berghei (Pb), 27% of the protein-coding regions of the annotated genes did not start from an ATG site, so their coding sequences (CDSs) are intrinsically incorrect. When we evaluated these annotations in more detail, we sometimes found that two neighboring annotated genes in Plasmodium yoelii (Py) or Pb that mapped onto different contigs were represented, correctly, as a single gene in Pf, leading to more genes having been annotated in Py and Pb than in Pf
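The start-codon observation above can be sketched as a simple check. The snippet assumes each annotated CDS is available as a nucleotide string already oriented 5' to 3' on the coding strand; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from typing import Dict


def starts_with_atg(cds: str) -> bool:
    """Return True if the coding sequence begins with the ATG start codon."""
    return cds.strip().upper().startswith("ATG")


def fraction_without_atg(cds_by_gene: Dict[str, str]) -> float:
    """Fraction of annotated CDSs that do not begin with ATG (0.0 for empty input)."""
    if not cds_by_gene:
        return 0.0
    missing = [gene for gene, seq in cds_by_gene.items() if not starts_with_atg(seq)]
    return len(missing) / len(cds_by_gene)


# Toy annotation set with made-up gene identifiers and sequences.
toy_cds = {
    "gene_a": "ATGAAATTTGGGTAA",
    "gene_b": "atgccctga",
    "gene_c": "TTGAAACCCTAG",  # does not start with ATG
    "gene_d": "ATGTAG",
}
toy_fraction = fraction_without_atg(toy_cds)
```

Here one of the four toy sequences fails the check, giving a fraction of 0.25. A CDS that fails this test is a candidate for re-evaluation rather than proof of an error, since the annotated gene may simply be truncated at a contig boundary.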


Summary

Introduction

Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexan parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance of physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexan parasites. Apicomplexa is a phylum of protozoan parasites that infect both humans and animals, causing serious health problems worldwide. The genomic sequences resulting from these sequencing efforts have been analyzed, revealing that even though the apicomplexan parasites are believed to have been derived from a common ancestor, their genome sizes and compositions vary widely. Genome information for each of these species has been made publicly available in one or more of the following databases: PlasmoDB [12,13,14], CryptoDB [15,16,17], ToxoDB [18,19], EuPathDB [20], and GeneDB [21].

