Short sequence repeat-mediated phase variation results in diverse phenotype presentation in many bacteria, including Campylobacter and Neisseria species. Current methods for identifying the expression states of phase-variable genes involve sampling large numbers of single colonies. This approach is subject to bias, sampling effects and high workloads that reduce the ability to perform intermediary sampling. The use of high-concentration colony sweeps provides a workaround but reduces the resolution of combinatorial expression profiles (termed phasotypes). A parsimonious approach combining both single colony and sweep data was developed to overcome these limitations. The critical methodological advance is an algorithm that takes the experimental data from the two sample types and applies a parsimonious, iterative mathematical analysis to output the phasotype distribution with the highest likelihood of underpinning the experimental data sets. The advantages of this unified method are increased resolution and accuracy of gene expression state combinations compared with conventional single colony sampling, a reduced requirement for sampling large numbers of colonies (and therefore reduced costs), and a higher capacity for collecting samples and replicates.
• Sweep and single colony data are input into an algorithm for rapid determination of the combinatorial phase variation states (phasotypes) of repeat-mediated phase-variable bacterial genes
• This method reduces the number of single colony samples required to produce accurate estimates of phasotypes
• This method will reduce the costs of phasotype analyses and increase the potential to analyse more time points or sample sites, leading to an improved understanding of how phase variation contributes to bacterial host persistence and the ability to cause disease
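To illustrate the general idea of reconciling the two sample types, the sketch below rescales a phasotype distribution estimated from single colonies until its per-gene ON fractions match those measured in a sweep. It uses iterative proportional fitting as a stand-in technique and is not the authors' algorithm, whose details are not given in the abstract. The function name `fit_phasotypes`, the inputs `colony_counts` and `sweep_on`, the pseudocount `pseudo`, and the example numbers are all hypothetical.

```python
from itertools import product


def fit_phasotypes(colony_counts, sweep_on, n_iter=200, pseudo=0.5):
    """Adjust colony-derived phasotype frequencies to match sweep ON fractions.

    colony_counts: dict mapping a phasotype tuple (1 = ON, 0 = OFF per gene)
                   to the number of single colonies observed (hypothetical input).
    sweep_on:      per-gene ON fraction measured from a sweep (hypothetical input).
    pseudo:        pseudocount so that unobserved phasotypes can receive mass
                   (an assumption of this sketch).
    """
    n_genes = len(sweep_on)
    states = list(product((0, 1), repeat=n_genes))

    # Starting estimate: single colony counts plus a pseudocount, normalised.
    weights = {s: colony_counts.get(s, 0) + pseudo for s in states}
    total = sum(weights.values())
    probs = {s: w / total for s, w in weights.items()}

    # Iterative proportional fitting: match one gene's sweep marginal at a time.
    for _ in range(n_iter):
        for g, target in enumerate(sweep_on):
            on = sum(p for s, p in probs.items() if s[g] == 1)
            off = sum(p for s, p in probs.items() if s[g] == 0)
            scale_on = target / on if on > 0 else 0.0
            scale_off = (1.0 - target) / off if off > 0 else 0.0
            for s in states:
                probs[s] *= scale_on if s[g] == 1 else scale_off
    return probs


# Made-up example: two genes, ten colonies, and sweep ON fractions of 70% and 50%.
example_counts = {(1, 0): 6, (0, 1): 4}
example_sweep = [0.7, 0.5]
fitted = fit_phasotypes(example_counts, example_sweep)
for phasotype, freq in sorted(fitted.items()):
    print(phasotype, round(freq, 4))
```

The single colony data supply the association between genes, while the sweep data pin down the per-gene marginals. This is one plausible way to combine the two sources; a likelihood-based search such as the one the abstract describes may weight them differently.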