Abstract
In the age of next-generation sequencing (NGS), data-driven methods such as genome-wide association studies (GWAS) and machine learning (ML) excel at finding patterns, but functional validation remains challenging because of the large number of candidate variants they produce. We designed an integrative approach that combines a GWAS on Streptococcus pneumoniae clinical isolates with whole-genome transformation coupled with NGS to functionally characterize a large set of GWAS candidates. Our study validated several phenotypic folA mutations beyond the standard Ile100Leu mutation, and showed that overexpression of the sulA locus produces trimethoprim (TMP) resistance in S. pneumoniae. When used to build ML models, these validated loci were the best inputs for predicting TMP minimal inhibitory concentrations. Integrative approaches can bridge the genotype-phenotype gap by providing biological insights that can be incorporated into ML models for accurate prediction of drug susceptibility.