Abstract

Background

Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale.

Results

Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments, we have identified 288,957 high-quality single nucleotide polymorphisms (SNPs) and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of the high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity, we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show significantly lower nucleotide diversity values.

Conclusions

This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the population, providing an essential resource for future studies on the development of new drugs and diagnostics for Chagas Disease. These data are available through the TcSNP database (http://snps.tcruzi.org).
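As a rough illustration of two quantities mentioned in the abstract, polymorphic alignment columns and nucleotide diversity, the following minimal Python sketch works on a toy alignment. It is not the pipeline used in the study: the function names `snp_columns` and `nucleotide_diversity`, the toy sequences, and the choice to skip gaps and ambiguous bases pairwise are all assumptions made for this example.

```python
from itertools import combinations

VALID = set("ACGT")  # assumption: uppercase, unambiguous bases only


def snp_columns(alignment):
    """Return 0-based indices of columns showing more than one valid base."""
    cols = []
    for i, column in enumerate(zip(*alignment)):
        bases = {b for b in column if b in VALID}
        if len(bases) > 1:
            cols.append(i)
    return cols


def nucleotide_diversity(alignment):
    """Mean proportion of differing sites over all sequence pairs.

    Sites where either sequence has a gap or ambiguous base are
    skipped for that pair (a simplifying assumption).
    """
    per_pair = []
    for a, b in combinations(alignment, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a, b):
            if x in VALID and y in VALID:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            per_pair.append(diffs / compared)
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


# Hypothetical toy alignment: four aligned sequences of equal length.
toy = [
    "ATGCCGTA",
    "ATGCTGTA",
    "ATGACGTA",
    "ATG-CGTA",
]

print(snp_columns(toy))           # polymorphic column indices
print(nucleotide_diversity(toy))  # average pairwise diversity
```

Computed per locus, lower values from a function like `nucleotide_diversity` correspond to more conserved genes, which is the kind of signal the abstract associates with purifying selection.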

Highlights

  • In this work we present an initial compilation of a genome-wide map of genetic diversity in T. cruzi, and its functional analysis, focussed mostly on protein-coding regions of the genome

  • Because of the repetitive nature of the T. cruzi genome [24,33], we decided to focus this initial effort on mapping the genetic diversity in mostly single copy protein coding loci

Introduction

Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. Some markers allow the distinction of two or three major lineages [11,12,13,14], while other experimental strategies, such as RAPD and multilocus isoenzyme electrophoresis (MLEE), support the distinction of six subdivisions [15,16,17], originally designated as DTUs I, IIa, IIb, IIc, IId, and IIe [16]. This nomenclature was revised as follows: TcI, TcII (formerly IIb), TcIII (IIc), TcIV (IIa), TcV (IId) and TcVI (IIe) [18,19]. The currently favoured hypothesis suggests that these two lineages originated after either one or two independent hybridization events between strains of DTUs TcII and TcIII [21,22,23].
