Estimating the Category of Districts in a State Based on COVID Test Positivity Rate (TPR): A Study Using Supervised Machine Learning Approach

Sourav Kumar Bhoi, Chittaranjan Mallick, Kalyan Kumar Jena, Rajendra Prasad Nayak, Debasis Mohapatra

doi:10.1007/978-981-19-1018-0_40

Abstract

In this paper, the districts of a state are categorized into Category A, Category B, and Category C based on the COVID test positivity rate (TPR) using supervised machine learning approaches. TPR is the ratio of the number of COVID tests found positive to the number of COVID tests performed. As per the report published by WHO, a TPR below 5% indicates that the infection is under control in the locality. Currently, state governments decide, based on the COVID TPR of a district, whether it has limited restrictions (Category A, low spread), a partial lockdown (Category B, moderate spread), or a complete lockdown (Category C, high spread). In this work, a synthetic dataset is generated following the WHO guideline, with TPR values from 0 to 5% for Category A, above 5% and up to 10% for Category B, and above 10% for Category C. This data is then used to train supervised machine learning (ML) models, namely decision tree (DT), neural network (NN), k-nearest neighbour (k-NN), and support vector machine (SVM), in order to find the model with the highest classification accuracy for predicting the category (A/B/C). Afterwards, testing data (TPR values) is generated for 100 districts using a random distribution function and fed into the ML models to estimate the category under which each district falls. The analysis is performed using the Orange 3.29.3 data analytics tool. From the results, it is observed that DT performs better than the other models in predicting the category of the districts with high probability.

Keywords: COVID, Test positivity rate, Category of districts, Machine learning, Supervised, Prediction
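The pipeline described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' Orange workflow: the function names, the uniform sampling, the upper TPR bound of 30%, the training set size of 600, and the choice of k = 5 are all assumptions made for the example. A small hand-written k-NN stands in for the four models compared in the paper.

```python
import random
from collections import Counter


def tpr_percent(positive_tests, total_tests):
    """TPR as a percentage: positive tests divided by tests performed."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return 100.0 * positive_tests / total_tests


def label_from_tpr(tpr):
    """Category thresholds as stated in the abstract."""
    if tpr <= 5.0:
        return "A"  # limited restrictions, low spread
    if tpr <= 10.0:
        return "B"  # partial lockdown, moderate spread
    return "C"      # complete lockdown, high spread


def make_synthetic(n, rng, max_tpr=30.0):
    """Draw n TPR values uniformly in [0, max_tpr] and label them.

    max_tpr and the uniform distribution are assumptions for this sketch.
    """
    rows = []
    for _ in range(n):
        tpr = rng.uniform(0.0, max_tpr)
        rows.append((tpr, label_from_tpr(tpr)))
    return rows


def knn_predict(train, tpr, k=5):
    """Majority vote among the k training rows closest in TPR."""
    nearest = sorted(train, key=lambda row: abs(row[0] - tpr))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = make_synthetic(600, rng)

# Testing data: one randomly generated TPR value per district, 100 districts.
test_tprs = [rng.uniform(0.0, 30.0) for _ in range(100)]
predicted = [knn_predict(train, t) for t in test_tprs]

# Because the labels come from fixed thresholds, the true category is known.
accuracy = sum(
    p == label_from_tpr(t) for p, t in zip(predicted, test_tprs)
) / len(test_tprs)
print(f"k-NN accuracy on 100 synthetic districts: {accuracy:.2f}")
```

Since the category is a deterministic function of a single feature, any misclassifications in such a setup can only occur for TPR values very close to the 5% and 10% boundaries. A decision tree can represent the same rule with two splits on TPR.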

Full Text