Deep Learning Computer Vision for Sorting and Size Determination of Municipal Waste

Daniel Octavian Melinte, Dan Dumitriu, Paul-Nicolae Ancuţa, Mihai Mărgăritescu

doi:10.1007/978-3-030-26991-3_14

Abstract

This paper presents a mobile robotic system for picking up and collecting waste and trash from the ground. The objects that make up the trash are detected by a camera mounted on the robotic system and identified using computer vision and deep neural networks. To pick up waste successfully, the robot needs both the size of the object and the distance to it. The proposed method assumes that the focal length of the camera is constant and consists of two phases: (a) the object dimensions l1 and w1 are measured in pixels on the first captured image; (b) the camera moves towards the object by a distance Δd = 100 mm and captures a second image, on which the object dimensions l2 and w2 are measured in pixels. For better results, both images should be captured while the robot is stationary. The algorithm computes the distance to the object as a function of Δd, l1 and l2, i.e., the dimensions in pixels measured on the two consecutive images (with d2 = d1 − Δd [mm]). The object dimensions are then determined as a function of l1 and l2 (or w1 and w2), of d1, and of the camera focal length fpx (in pixels). This determination of object size and distance is simple but reliable, and shows good performance in practice. The image processing uses pre-trained deep convolutional networks with Single Shot Detectors (SSD) and MobileNetsV1. The MobileNetsV1 architecture consists of 28 convolutional layers, one aggregation layer, and one fully connected layer. All 28 convolutional layers are followed by a non-linear function (ReLU). The transition from the linear convolutional layers to the non-linear layers is done using a normalization function. After the convolutions are performed, the aggregation takes place, followed by classification through a fully connected layer.
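The two-image procedure can be illustrated with a minimal sketch, assuming an ideal pinhole camera model (an assumption here; the abstract states only that the focal length is constant). Under that model an object of real length L at distance d appears with pixel length l = fpx · L / d, so l1 · d1 = l2 · d2. Combined with d2 = d1 − Δd, this gives d1 = Δd · l2 / (l2 − l1) and L = l1 · d1 / fpx. The function name `distance_and_size` and its parameter names are hypothetical and are not taken from the paper.

```python
def distance_and_size(l1_px, l2_px, delta_d_mm, f_px):
    """Estimate distance and real size from two images (pinhole-model sketch).

    l1_px      -- object dimension in pixels on the first image
    l2_px      -- same dimension in pixels after moving closer
    delta_d_mm -- camera displacement towards the object, in mm
    f_px       -- camera focal length in pixels (assumed constant)

    Returns (d1_mm, d2_mm, size_mm). This is an illustrative
    reconstruction, not the authors' implementation.
    """
    if delta_d_mm <= 0 or f_px <= 0:
        raise ValueError("delta_d_mm and f_px must be positive")
    if l2_px <= l1_px:
        # Moving closer must enlarge the object in the image.
        raise ValueError("l2_px must be greater than l1_px")

    # From l1 * d1 = l2 * (d1 - delta_d):
    d1_mm = delta_d_mm * l2_px / (l2_px - l1_px)
    d2_mm = d1_mm - delta_d_mm

    # Pinhole projection inverted: L = l1 * d1 / f_px
    size_mm = l1_px * d1_mm / f_px
    return d1_mm, d2_mm, size_mm
```

For example, with l1 = 100 px, l2 = 125 px, Δd = 100 mm and an assumed fpx = 500 px, the sketch yields d1 = 500 mm, d2 = 400 mm and an object length of 100 mm. The same call with w1 and w2 gives the object width. Because the result depends on the difference l2 − l1, small pixel measurement errors are amplified when the object is far away relative to Δd.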

Full Text