Abstract

To maintain a clean living environment, governments around the world employ large numbers of workers to clear waste from pavements, an approach that is inefficient for waste management. To alleviate this problem, researchers have proposed several deep learning methods that detect and recognize waste from RGB images. Considering the limitations of color images, we propose an efficient multi-modal learning solution for pavement waste detection and recognition. Specifically, we construct a high-quality outdoor pavement waste dataset, OPWaste, which better reflects real-world needs. Compared with other waste datasets, the OPWaste dataset not only offers rich backgrounds and high diversity but also provides both color and depth images. We also explore six different multi-modal fusion methods and propose a novel multi-modal multi-scale network (MM-Net) for RGB-D waste detection and recognition. MM-Net introduces two new components: a multi-scale refinement module (MRM), which refines critical features using attention mechanisms, and a multi-scale interaction module (MIM), which progressively exchanges information between hierarchical features. In addition, we select several representative methods and perform comparative experiments. Experimental results show that MM-Net with the image addition fusion method outperforms the other deep learning models, reaching 97.3% mAP0.5 and 84.4% AR. These results indicate that multi-modal learning plays an important role in intelligent waste recycling. As a promising auxiliary tool, our solution can be applied to intelligent cleaning robots for automatic outdoor waste management.
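
To make the idea of image addition fusion concrete, the following is a minimal sketch of fusing a color image and a depth map at the input level by element-wise addition. It is not the authors' implementation: the function name `add_fuse`, the min-max depth normalization, and the `depth_weight` parameter are all assumptions made for illustration, since the abstract does not specify how the two modalities are scaled before being added.

```python
import numpy as np


def add_fuse(rgb, depth, depth_weight=0.5):
    """Fuse an RGB image (H, W, 3) and a depth map (H, W) by weighted addition.

    Hypothetical helper: the weighting and normalization are illustrative
    assumptions, not details taken from the paper.
    Returns a float32 array of shape (H, W, 3) with values in [0, 1].
    """
    if rgb.shape[:2] != depth.shape[:2]:
        raise ValueError("RGB and depth images must have the same height and width")

    # Scale 8-bit color values to [0, 1].
    rgb_f = rgb.astype(np.float32) / 255.0

    # Min-max normalize depth to [0, 1]; a constant depth map becomes all zeros.
    d = depth.astype(np.float32)
    d_range = d.max() - d.min()
    d_norm = (d - d.min()) / d_range if d_range > 0 else np.zeros_like(d)

    # Broadcast the single depth channel across the three color channels and add.
    return (1.0 - depth_weight) * rgb_f + depth_weight * d_norm[..., None]


if __name__ == "__main__":
    rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
    depth = np.array([[0, 1], [2, 3]], dtype=np.uint16)
    fused = add_fuse(rgb, depth)
    print(fused.shape)
```

The fused array keeps the three-channel shape of the color image, so under these assumptions it could be passed to a detector that expects ordinary RGB input without changing the network's first layer.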
