Genomic diversity in a pathogen population is the foundation for evolution and adaptations in virulence, drug resistance, pathogenesis, and immune evasion. Characterizing, analyzing, and understanding population-level diversity is also essential for epidemiological and forensic tracking of sources and for revealing detailed pathways of transmission and spread. For bacteria, culturing, isolating, and sequencing the large number of individual colonies required to adequately sample diversity can be prohibitively time-consuming and expensive. While sequencing directly from a mixed population will show variants among reads, these variants cannot be linked to reveal allele combinations associated with particular traits or phylogenetic inheritance patterns. Here, we describe the theory and method by which population sequencing directly from a mixed sample can be used in conjunction with sequencing a very small number of colonies to describe the phylogenetic diversity of a population without haplotype reconstruction. To demonstrate the utility of population sequencing in capturing phylogenetic diversity, we compared isogenic clones to population sequences of Burkholderia pseudomallei from the sputum of a single patient. We also analyzed population sequences of Staphylococcus aureus derived from different people and different body sites. Sequencing results confirm our ability to capture and characterize phylogenetic diversity in our samples. Our analyses of B. pseudomallei populations led to the surprising discovery that the pathogen population is highly structured in sputum, suggesting that for some pathogens, sputum sampling may preserve structuring in the lungs and thus present a non-invasive alternative for understanding colonization, movement, and pathogen/host interactions. Our analyses of S. aureus samples show how comparing phylogenetic diversity across populations can reveal directionality of transmission between hosts and across body sites, demonstrating the power and utility of this approach for characterizing the spread of disease and identifying reservoirs at the finest levels. We anticipate that population sequencing and analysis can be applied to accelerate research in a broad range of fields reliant on a foundational understanding of population diversity.