Machine learning algorithms for dengue risk assessment: a case study for São Luís do Maranhão

Fernanda Paula Rocha, Mateus Giesbrecht

doi:10.1007/s40314-022-02101-z

Abstract

This study aims to assess dengue fever risk using machine learning techniques, such as logistic regression, linear discriminant analysis, Naive Bayes, decision tree, and random forest classifiers. This kind of approach to epidemiological problems has been developed to detect the risk of disease occurrence, and it allows public policies for preventing public health problems to be based on mathematical models. In this study, the models were trained with data from the municipality of São Luís do Maranhão, state of Maranhão, Brazil. Most related works analyze data at the state, country, or continental level, where more data are available. To apply the approach to such a small region, oversampling techniques were used. The number of cases per neighborhood from 2014 to 2020, together with climatic, territorial, and environmental data, was used as input to estimate the probability of dengue occurrence in the municipality. Because the database is unbalanced, we used the SMOTE, ADASYN, and DBSMOTE oversampling techniques. The DBSMOTE-trained random forest classifier achieved the best results, with an AUC of 75.1%, a sensitivity of 75.43%, and a specificity of 60.53%.
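
To make the oversampling and evaluation ideas mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified SMOTE-style interpolation (ADASYN and DBSMOTE choose where to generate points differently) and a binary label where 1 means dengue occurrence. The function names `smote_like` and `sensitivity_specificity` and the toy data are hypothetical.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE idea: create n_new synthetic minority samples by
    interpolating between a minority sample and one of its k nearest
    minority neighbours. Illustrative only; not DBSMOTE or ADASYN."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Assumes both classes are present in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: four minority-class points in two features.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority_points, n_new=6)

# Hypothetical labels and predictions, only to show the metric definitions.
sens, spec = sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0])
```

Because each synthetic point lies on a segment between two existing minority samples, oversampling of this kind balances the classes without duplicating records exactly. In practice the synthetic samples would be added to the training split only, so that sensitivity and specificity are measured on real observations.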

Full Text