Background
A significant proportion of individuals with symptoms of sexually transmitted infection (STI) delay or avoid seeking healthcare, and digital diagnostic tools may prompt them to seek healthcare earlier. Unfortunately, none of the currently available tools fully mimics clinical assessment or covers a wide range of STIs.

Methods
Between 2015 and 2018, we prospectively invited attendees presenting with STI-related symptoms at Melbourne Sexual Health Centre to answer gender-specific questionnaires covering the symptoms of 12 common STIs using a computer-assisted self-interviewing system. We then developed an online symptom checker (iSpySTI.org) using Bayesian networks. In this study, various machine learning algorithms were trained and evaluated for their ability to predict these STIs and anogenital conditions. We used the Z-test to compare their average area under the ROC curve (AUC) scores with those of the Bayesian networks for diagnostic accuracy.

Results
The study population included 6,162 men (median age 30, IQR: 26–38; approximately 40% of whom had sex with men in the past 12 months) and 4,358 women (median age 27, IQR: 24–31). Non-gonococcal urethritis (NGU) (23.6%, 1447/6121), genital warts (11.7%, 718/6121) and balanitis (8.9%, 546/6121) were the most common conditions in men. Candidiasis (16.6%, 722/4358) and bacterial vaginosis (16.2%, 707/4358) were the most common conditions in women. When evaluated on unseen datasets, the machine learning models performed well for most male conditions, with AUCs ranging from 0.81 to 0.95, except for urinary tract infections (UTIs) (AUC 0.72). Similarly, the models achieved AUCs ranging from 0.75 to 0.95 for female conditions, except for cervicitis (AUC 0.58). Urethral discharge and other urinary symptoms were important features for predicting urethral gonorrhoea, NGU and UTIs. Similarly, the skin images that participants selected as resembling their own lesions, and the location of the anogenital skin lesions, were strong predictors. Vaginal discharge (odour, colour) and itchiness were important predictors for bacterial vaginosis and candidiasis. The machine learning models performed significantly better than the Bayesian models for male balanitis, molluscum contagiosum and genital warts (P < 0.05), but performance was similar for the other conditions.

Conclusions
Both machine learning and Bayesian models could predict correct diagnoses with reasonable accuracy using prospectively collected data for 12 STIs and other common anogenital conditions. Further work should expand the number of anogenital conditions and seek ways to improve the accuracy, potentially using patient-collected images to supplement questionnaire data.
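The abstract does not say which form of the Z-test was used to compare average AUCs, so the following is only a minimal sketch of one common form. It assumes the two AUC estimates are independent and approximately normal, and that a standard error is available for each. The function name and the input numbers are hypothetical and are not taken from the study.

```python
import math


def two_sample_z_test(auc_a, se_a, auc_b, se_b):
    """Two-sided Z-test for the difference between two mean AUCs.

    Assumption (not stated in the abstract): the two estimates are
    independent and approximately normally distributed.
    """
    # Standard error of the difference under independence
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (auc_a - auc_b) / se_diff
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical values for illustration only
    z, p = two_sample_z_test(0.90, 0.010, 0.86, 0.012)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

When both models are scored on the same test participants, the two AUCs are correlated, and a paired method such as DeLong's test may be more appropriate than this independent-samples form.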