Abstract

Short tandem repeats (STRs) are highly informative genetic markers that have been used extensively in population genetic analyses. They are an important source of genetic diversity and can also have functional impact. Despite the availability of bioinformatic methods that permit large-scale, genome-wide genotyping of STRs from whole-genome sequencing data, these methods have not previously been applied to sequencing data from large collections of malaria parasite field samples. Here, we genotyped STRs using HipSTR in published whole-genome sequence data from more than 3,000 Plasmodium falciparum and 174 Plasmodium vivax samples collected across the globe. High levels of noise and variability in the resultant callset necessitated the development of a novel method for quality control of STR genotype calls. A set of high-quality STR loci (6,768 from P. falciparum and 3,496 from P. vivax) was used to study Plasmodium genetic diversity, population structure and genomic signatures of selection, and the results were compared with genome-wide single nucleotide polymorphism (SNP) genotyping data. In addition, genome-wide information about genetic variation and other characteristics of STRs in P. falciparum and P. vivax has been made available in an interactive web-based R Shiny application, PlasmoSTR (https://github.com/bahlolab/PlasmoSTR).

Highlights

  • Short tandem repeats (STRs), also known as microsatellites, are tandem nucleotide repeats (1–9 base pairs, bp) that are both abundant throughout the genome and highly polymorphic.

  • We use a set of genome-wide, high-quality STRs to study parasite population genetics and compare them with genome-wide single nucleotide polymorphism (SNP) genotyping data, revealing high consistency with SNP-based signals and identifying some signals unique to the STR marker data. These results demonstrate that identifying highly informative STR markers from large numbers of population samples is a powerful approach for studying genetic diversity, population structure and genomic signatures of selection in P. falciparum and P. vivax.

  • Sample sizes varied by population: South America (SAM) = 31, West Africa (WAF) = 959, Central Africa (CAF) = 100, East Africa (EAF) = 327, South Asia (SAS) = 32, the western part of Southeast Asia (WSEA) = 690, the eastern part of Southeast Asia (ESEA) = 827 and Oceania (OCE) = 81 [21].


Summary

Introduction

Short tandem repeats (STRs), also known as microsatellites, are tandem nucleotide repeats (1–9 base pairs, bp) that are both abundant throughout the genome and highly polymorphic. In organisms with AT content close to 50%, such as Drosophila or humans, STRs account for only 1–3% of the genome [3,6,7]. These repetitive sequences can arise, expand or contract rapidly. STRs in coding regions with a motif size that is a multiple of three (e.g. trinucleotide or hexanucleotide repeats) do not cause a frame-shift mutation when repeats are deleted or added, but they can change the protein sequence [10]. For example, the Pfnhe-1 protein contains the polymorphic amino acid motifs DNNND (GATAACAATAATGAT) and DDNHNDNHNND (GATGATAACCATAATGATAATCATAATAATGAT), which affect the capabilities of the P. falciparum Na+/H+ exchanger and influence quinine resistance in combination with Pfcrt and Pfmdr1 [11,12].
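
The reading-frame argument above can be checked with a short worked example. The sketch below is illustrative only and is not part of the authors' pipeline: the function names `translate` and `preserves_frame` are hypothetical, and the codon table is deliberately limited to the four codons that occur in the two Pfnhe-1 motifs quoted in the text (assuming the standard genetic code).

```python
# Illustrative sketch; names below are hypothetical, not from the paper.
# Subset of the standard genetic code covering only the codons in the motifs.
CODON_TABLE = {"GAT": "D", "AAC": "N", "AAT": "N", "CAT": "H"}

MOTIF_1 = "GATAACAATAATGAT"
MOTIF_2 = "GATGATAACCATAATGATAATCATAATAATGAT"


def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def preserves_frame(motif_len: int, copies_changed: int) -> bool:
    """True if gaining or losing `copies_changed` repeat units keeps the frame.

    The net length change is motif_len * copies_changed base pairs; the
    reading frame is kept only when that change is a multiple of three.
    """
    return (motif_len * copies_changed) % 3 == 0


print(translate(MOTIF_1))  # DNNND
print(translate(MOTIF_2))  # DDNHNDNHNND

# Adding one extra AAT unit keeps the frame but lengthens the protein motif.
expanded = "GATAACAATAAT" + "AAT" + "GAT"
print(translate(expanded))  # DNNNND

print(preserves_frame(3, 1))  # True: one trinucleotide unit
print(preserves_frame(2, 1))  # False: one dinucleotide unit shifts the frame
```

Any number of trinucleotide or hexanucleotide units changes the sequence length by a multiple of three, which is why such STRs can alter a protein without disrupting the downstream reading frame.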

