Abstract

Wastewater Surveillance (WS) is a crucial tool in the management of the COVID-19 pandemic. It is based on quantifying SARS-CoV-2 RNA concentrations in a community's sewage. In this study, we used WS data to develop a regression model for estimating the number of active COVID-19 cases on a university campus. Eight regression model types, each in univariate and multivariate form, were developed and compared: Linear Regression (LM), Polynomial Regression (PR), Generalised Additive Model (GAM), Locally Estimated Scatterplot Smoothing Regression (LOESS), K Nearest Neighbours Regression (KNN), Support Vector Regression (SVR), Artificial Neural Networks (ANN) and Random Forest (RF). We found that the multivariate RF regression model was the most appropriate for predicting the prevalence of COVID-19 infections at both campus level and hostel level. We also found that smoothing the normalised SARS-CoV-2 data and employing multivariate modelling, with student population as a second independent variable, significantly improved the performance of the models. The final campus-level RF model showed good accuracy when tested on previously unseen data, with a correlation coefficient of 0.97 and a mean absolute error (MAE) of 20%. In summary, our non-intrusive approach can complement projections based on clinical tests, facilitating timely follow-up and response.
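
As a rough illustration of the smoothing and evaluation steps mentioned above, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state which smoothing method was used or how the percentage MAE was computed, so the centred moving average, the window size, the definition of MAE as a percentage of the mean observed case count, and all function names here are assumptions made for illustration only.

```python
from math import sqrt


def moving_average(values, window=3):
    """Centred moving average (assumed smoother); the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mae_percent(observed, predicted):
    """MAE as a percentage of the mean observed value (an assumed definition)."""
    n = len(observed)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return 100.0 * mae / (sum(observed) / n)
```

In a workflow of this kind, the smoothed signal and the student population would form the two input features of the regression model, and `pearson_r` and `mae_percent` would be applied to observed versus predicted case counts on held-out data.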
