Abstract

Background: Plasmodium falciparum is the main causative agent of malaria. Of the 5 484 predicted genes of P. falciparum, about 57% do not have sufficient sequence similarity to characterized genes in other species to warrant functional assignments. Non-homology methods are thus needed to obtain functional clues for these uncharacterized genes. Gene expression data have been widely used in recent years to support intra-species functional annotation via the Guilt By Association (GBA) principle.

Results: We propose a new method that uses gene expression data to assess inter-species annotation transfers. Our approach starts from a set of likely orthologs between a reference species (here, S. cerevisiae or D. melanogaster) and a query species (P. falciparum). It aims to identify clusters of coexpressed genes in the query species whose coexpression has been conserved in the reference species. These conserved clusters of coexpressed genes are then used to assess annotation transfers between genes with low sequence similarity, enabling reliable transfers of annotations from the reference species to the query species. The approach was applied to transcriptomic data sets of P. falciparum, S. cerevisiae and D. melanogaster, and enabled us to propose with high confidence new or refined annotations for several dozen hypothetical or putative P. falciparum genes. Notably, we revised the annotation of genes related to ribosomal proteins and to ribosome biogenesis and assembly, thus highlighting several potential drug targets.

Conclusions: Our approach uses both sequence similarity and gene expression data to support inter-species gene annotation transfers. Experiments show that this strategy improves on the accuracy achieved with sequence similarity alone and outperforms the GBA approach. In addition, our experiments with P. falciparum show that it can infer a function for numerous hypothetical genes.

Highlights

  • Plasmodium falciparum is the main causative agent of malaria

  • We focused on this biological process as a case study and attempted to uncover some related malarial genes, paying special attention to the Bozdech data sets, in which the cell cycle was synchronized. 38 genes were proposed to be involved in cell cycle regulation by Tienda-Luna et al. (2008) [42], and 97 P. falciparum kinase genes were reported by Ward et al. (2004) [43] as representative of the malarial kinome

  • Our approach searches for conserved coexpression between a query species and a reference species, and uses this information to increase confidence of annotation transfers between genes with borderline sequence similarity
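
The idea of conserved coexpression can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`pearson`, `conserved_coexpression`), the use of Pearson correlation, the pairwise (rather than cluster-based) test, the 0.8 cutoff, and the toy data are all assumptions made for illustration. The sketch keeps a pair of query-species genes only if the pair is coexpressed in the query species and their candidate orthologs are also coexpressed in the reference species.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A flat profile has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def conserved_coexpression(query_expr, ref_expr, orthologs, threshold=0.8):
    """Return query gene pairs that are coexpressed in both species.

    query_expr / ref_expr: dict mapping gene -> list of expression values.
    orthologs: dict mapping query gene -> candidate reference ortholog.
    threshold: hypothetical correlation cutoff (illustrative, not from the paper).
    """
    conserved = []
    for g1, g2 in combinations(sorted(orthologs), 2):
        r_query = pearson(query_expr[g1], query_expr[g2])
        r_ref = pearson(ref_expr[orthologs[g1]], ref_expr[orthologs[g2]])
        if r_query >= threshold and r_ref >= threshold:
            conserved.append((g1, g2))
    return conserved


# Toy profiles, invented for illustration only.
query_expr = {
    "q1": [1.0, 2.0, 3.0, 4.0],
    "q2": [2.0, 4.1, 6.0, 8.2],
    "q3": [4.0, 1.0, 3.0, 2.0],
}
ref_expr = {
    "r1": [0.5, 1.0, 1.5, 2.1],
    "r2": [1.0, 2.0, 3.1, 4.0],
    "r3": [1.0, 2.0, 3.0, 4.0],
}
orthologs = {"q1": "r1", "q2": "r2", "q3": "r3"}

pairs = conserved_coexpression(query_expr, ref_expr, orthologs)
print(pairs)
```

In this toy setting, only the pair whose correlation is high in both species is retained. A pair whose orthologs are coexpressed in the reference species but which is not itself coexpressed in the query species is discarded, which is what lets conserved coexpression act as extra evidence for a borderline sequence match.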

Summary

Introduction

Plasmodium falciparum is the main causative agent of malaria. Of the 5 484 predicted genes of P. falciparum, about 57% do not have sufficient sequence similarity to characterized genes in other species to warrant functional assignments. Non-homology methods are needed to obtain functional clues for these uncharacterized genes. Malaria is due to infections by protozoan parasites of the Plasmodium genus, transmitted by bites of female Anopheles mosquitoes. Of the four species that infect humans, P. falciparum causes the greatest incidence of illness and death [1]. Despite sustained efforts to combat the disease, safe and affordable new drugs, and new drug [...] genus or even the Apicomplexan phylum to which these organisms belong [5], it is certainly further exacerbated by the high evolutionary distance between P. falciparum and other sequenced organisms [6], which makes homology detection difficult. P. falciparum is a typical organism for which new approaches are needed to help detect distant homologs.
