Background
In 2021 and 2023, the World Health Organization approved the RTS,S/AS01 and R21/Matrix-M malaria vaccines, respectively, for routine immunization of children in African countries with moderate to high transmission. Both vaccines are based on the Plasmodium falciparum circumsporozoite protein (PfCSP), but polymorphisms in the Pfcsp gene raise concerns about strain-specific responses and the long-term efficacy of these vaccines. This study assessed Pfcsp genetic diversity, population structure and signatures of selection among parasites from areas of different malaria transmission intensities in Mainland Tanzania, to generate baseline data before the introduction of malaria vaccines in the country.

Methods
The analysis involved 589 whole genome sequences generated as part of the MalariaGEN Community Project. The samples were collected between 2013 and January 2015 from five regions of Mainland Tanzania: Morogoro and Tanga (Muheza) (moderate transmission areas), and Kagera (Muleba), Lindi (Nachingwea), and Kigoma (Ujiji) (high transmission areas). Wright’s inbreeding coefficient (Fws), Wright’s fixation index (FST), principal component analysis, nucleotide diversity, and Tajima’s D were used to assess within-host parasite diversity, population structure and natural selection.

Results
A high proportion of infections were polyclonal (Fws < 0.95), ranging from 56.9% in Muheza to 69.23% in Nachingwea. No population structure was detected in the Pfcsp gene across the five regions (mean FST = 0.0068). The average nucleotide differentiation (K), haplotype diversity (Hd) and nucleotide diversity (π) across the five regions were 4.19, 0.973 and 0.0035, respectively. The C-terminal region of Pfcsp showed high nucleotide diversity in the Th2R and Th3R regions. Tajima’s D values were positive in the Th2R and Th3R regions, consistent with balancing selection.
Compared with the 3D7 strain, the Pfcsp C-terminal sequences comprised 50 different haplotypes (H_1 to H_50), with only 2% of sequences matching the 3D7 haplotype (H_50). Compared with the NF54 strain, they comprised 49 different haplotypes (H_1 to H_49), with only 0.4% of sequences matching the NF54 haplotype (H_49).

Conclusions
The findings demonstrate high diversity of the Pfcsp gene with limited population differentiation. The Pfcsp gene showed positive Tajima’s D values, consistent with balancing selection for variants within the Th2R and Th3R regions. The study observed differences between the haplotypes incorporated into the design of the RTS,S and R21 vaccines and those present in natural parasite populations. Additional research is therefore warranted, incorporating other regions and more recent data, to comprehensively assess trends in genetic diversity within this important gene. Such insights will inform the choice of alleles to be included in future vaccines.
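
For readers unfamiliar with the summary statistics reported above (K, π, Hd and Tajima’s D), the following is a minimal Python sketch of how they can be computed from a set of aligned haplotype sequences. It is an illustration only and not the pipeline used in the study: the function name `diversity_stats` and the toy sequences are hypothetical, and the sketch assumes equal-length, gap-free sequences and counts any site with more than one allele as segregating.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return K, pi, Hd, S and Tajima's D for equal-length aligned sequences.

    Hypothetical helper for illustration; not the study's actual method.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # K: average number of nucleotide differences over all sequence pairs
    n_pairs = n * (n - 1) / 2
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    k = total_diffs / n_pairs

    # pi: nucleotide diversity, i.e. K per site
    pi = k / length

    # Hd: haplotype diversity with the n / (n - 1) sample-size correction
    counts = Counter(seqs)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # S: number of segregating (polymorphic) sites
    s_sites = sum(len(set(column)) > 1 for column in zip(*seqs))

    # Tajima's D (Tajima 1989); undefined when there are no segregating sites
    if s_sites == 0:
        tajima_d = None
    else:
        a1 = sum(1 / i for i in range(1, n))
        a2 = sum(1 / i ** 2 for i in range(1, n))
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
        tajima_d = (k - s_sites / a1) / sqrt(variance)

    return {"K": k, "pi": pi, "Hd": hd, "S": s_sites, "D": tajima_d}


# Toy example with made-up sequences (not data from the study)
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
stats = diversity_stats(toy)
print(stats)
```

A positive D arises when the mean pairwise difference K exceeds the value expected from the number of segregating sites, which is the pattern described above as consistent with balancing selection. In practice, established tools would normally be preferred over a hand-written function.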