Contaminated recyclables, which are frequently discarded as waste, pose a significant challenge to the implementation of a circular economy. They impede the circulation of resources and raise processing costs at material recovery facilities (MRFs). Over the past few decades, machine learning (ML) models such as linear regression (LR), support vector machine (SVM), and random forest (RF) have evolved to offer new methods for predicting inbound contamination rates alongside traditional statistical models. In this study, we applied ML models to predict inbound contamination rates using demographic features from 15 U.S. counties with different curbside collection strategies. In general, we found that ML models outperformed linear mixed models. Specifically, SVM models had the highest performance (R² = 0.75; mean absolute error (MAE) = 0.06), which may be due to their ability to model nonlinear relationships between features and inbound contamination rates. Population was the key predictor; poverty rate was positively correlated, and median age negatively correlated, with inbound contamination rates. To improve the management of contamination and advance the implementation of a circular economy, better models are needed to understand and estimate inbound contamination rates and to identify the critical factors that drive them, both now and in the future.
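For readers less familiar with the two performance metrics reported above, the coefficient of determination and the mean absolute error, the following is a minimal sketch of how they are commonly computed. It is not the study's code: the function names and the `observed` and `predicted` values are made up purely for illustration and are not data from the 15 counties.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted rates."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical contamination rates (as fractions), NOT values from the study.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.15, 0.15, 0.35, 0.35]

print("MAE:", mean_absolute_error(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

Under these definitions, an MAE of 0.06 means predictions are off by about six percentage points of contamination rate on average, assuming rates are expressed as fractions between 0 and 1.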