Serotype replacement occurred in Streptococcus pneumoniae after the introduction of pneumococcal conjugate vaccines (PCVs). Predicting which pneumococcal strains will become common in carriage after vaccination can enhance vaccine design, inform public health interventions, and improve understanding of pneumococcal evolution. Invasive pneumococcal isolates were collected from 1998 to 2018 by the Active Bacterial Core surveillance (ABCs). Carriage data from Massachusetts (MA) and the Southwest United States were used to calculate weights. Using pre-vaccine data, serotype-specific inverse-invasiveness weights were defined as the ratio of a serotype's proportion in carriage to its proportion in invasive disease. Genomic data were processed with bioinformatic pipelines to define genetically similar sequence clusters (i.e., strains) and the accessory genes (clusters of orthologous genes, COGs) present in 5-95% of isolates. The weights were applied to adjust observed strain proportions and COG frequencies. The negative frequency-dependent selection (NFDS) model predicted strain proportions by calculating the post-vaccine strain composition in the weighted invasive disease population that would best match pre-vaccine COG frequencies. Inverse-invasiveness weighting increased the correlation of COG frequencies, on either the linear or logit scale, between invasive and carriage data in the pre-vaccine, post-PCV7, and post-PCV13 epochs, and between different epochs within the invasive data. Weighting the invasive data significantly improved the NFDS model's accuracy in predicting strain proportions in the carriage population in the post-PCV13 epoch: the adjusted R² increased from 0.254 before weighting to 0.545 after weighting. The weighting system thus adjusted invasive disease data to better represent the pneumococcal carriage population, allowing the NFDS mechanism to predict strain proportions in carriage in the post-PCV13 epoch.
Our methods enrich the value of genomic sequences from invasive disease surveillance.

IMPORTANCE: Streptococcus pneumoniae, a common colonizer of the human nasopharynx, can cause invasive diseases including pneumonia, bacteremia, and meningitis, mostly in children under 5 years of age and in older adults. PCV7 was introduced in the United States pediatric population in 2000 to prevent disease and reduce deaths, followed by PCV13 in 2010, PCV15 in 2022, and PCV20 in 2023. After the removal of vaccine serotypes, the prevalence of carriage remained stable because the vacated pediatric ecological niche was filled by certain non-vaccine serotypes. Predicting which pneumococcal clones, and which serotypes, will be most successful in colonization after vaccination can enhance vaccine design and public health interventions, while also improving our understanding of pneumococcal evolution. Carriage data, which are collected from the pneumococcal population that is competing to colonize and transmit, are the most directly relevant to evolutionary studies, but invasive disease data are often more plentiful. Previously, evolutionary models based on negative frequency-dependent selection (NFDS) on the accessory genome were shown to predict which non-vaccine strains and serotypes were most successful in colonization following the introduction of PCV7. Here, we show that an inverse-invasiveness weighting system applied to invasive disease surveillance data allows the NFDS model to predict strain proportions in the projected carriage population in the post-PCV13 (pre-PCV15 and pre-PCV20) epoch. The significance of our research lies in using a sample of invasive disease surveillance data to extend the use of NFDS as an evolutionary mechanism to predict post-PCV13 population dynamics.
These results show that biased sampling arising from differences in virulence can be corrected, which enriches the value of genomic data from disease surveillance and advances our understanding of how NFDS affects carriage population dynamics after both PCV7 and PCV13 vaccination.
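
The inverse-invasiveness weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy serotype labels ("A", "B"), and the strain labels ("SC1", "SC2") are hypothetical. The sketch also assumes that each invasive isolate carries the weight of its serotype and that adjusted strain proportions are the normalized sums of those weights, a detail the abstract does not spell out.

```python
from collections import Counter, defaultdict


def proportions(labels):
    """Return the proportion of each label in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def inverse_invasiveness_weights(carriage_serotypes, invasive_serotypes):
    """Weight per serotype = proportion in carriage / proportion in invasive data.

    Serotypes never seen in carriage get weight 0 (an assumption of this sketch).
    """
    p_carriage = proportions(carriage_serotypes)
    p_invasive = proportions(invasive_serotypes)
    return {s: p_carriage.get(s, 0.0) / p_invasive[s] for s in p_invasive}


def weighted_strain_proportions(isolates, weights):
    """Adjust strain proportions in invasive data.

    `isolates` is a list of (serotype, strain) pairs. Each isolate contributes
    the weight of its serotype, and the totals are normalized to sum to 1.
    """
    totals = defaultdict(float)
    for serotype, strain in isolates:
        totals[strain] += weights.get(serotype, 0.0)
    grand_total = sum(totals.values())
    return {strain: t / grand_total for strain, t in totals.items()}


# Toy pre-vaccine data: serotype A is over-represented in invasive disease.
carriage = ["A"] * 2 + ["B"] * 8
invasive = ["A"] * 6 + ["B"] * 4
weights = inverse_invasiveness_weights(carriage, invasive)

# Toy invasive isolates, each labelled with (serotype, strain).
invasive_isolates = [("A", "SC1")] * 6 + [("B", "SC2")] * 4
adjusted = weighted_strain_proportions(invasive_isolates, weights)
```

In this toy case the highly invasive serotype A is down-weighted and serotype B is up-weighted, so the adjusted strain proportions match the carriage sample from which the weights were derived. In the study design summarized above, the weights come from pre-vaccine data and are then applied to invasive isolates from later epochs.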