Abstract
This work investigates prediction and feature importance for estimating COVID-19 infection using a Machine Learning approach. We analyzed the effect of adding climatic features, mobility, government actions, and the number of cases per health sub-territory to an existing model. A Random Forest with the Permutation Importance method was used to assess feature importance and to list the thirty features most relevant to the probability of infection. Among all features, the most important were: i) the variables per health region, ii) the period between the date of notification and symptom onset, iii) symptom features such as fever, cough, and sore throat, iv) traffic flow and mobility variables, and v) weather features. The model was validated and reached an average accuracy of 81.82%, with sensitivity and specificity of 87.52% and 78.67%, respectively, in the infection estimate. The proposed investigation therefore offers an alternative to guide authorities in understanding aspects related to the disease.
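To make the Permutation Importance idea concrete, the following is a minimal, model-agnostic sketch in plain Python: a feature's importance is taken as the mean drop in accuracy when that feature's column is shuffled. It is an illustration only and not the study's implementation. The toy data, the feature names (`fever`, `noise`), and the rule-based `toy_predict` function standing in for a trained Random Forest are all assumptions made for the example.

```python
import random
from typing import Callable, List

Row = List[float]


def accuracy(predict: Callable[[Row], int], X: List[Row], y: List[int]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for row, label in zip(X, y) if predict(row) == label)
    return correct / len(y)


def permutation_importance(
    predict: Callable[[Row], int],
    X: List[Row],
    y: List[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: column 0 is "fever" (informative), column 1 is "noise".
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0, 1, 0]


def toy_predict(row: Row) -> int:
    """Stand-in for a trained classifier (assumption): uses only column 0."""
    return 1 if row[0] >= 0.5 else 0


importances = permutation_importance(toy_predict, X, y)
ranking = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
```

Here `ranking` orders feature indices from most to least important; a feature the model never uses, such as the `noise` column, gets an importance of exactly zero because shuffling it cannot change any prediction. In practice the same procedure would be applied to a fitted classifier on held-out data, keeping the top-ranked features.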