Abstract

This research addresses automated parcel supply in logistics by employing a robotic arm to replace manual intervention. The task, suction-based retrieval of disorderly stacked parcels and their placement onto a conveyor belt, poses an economically significant engineering challenge. This paper designs a visual recognition system that assists the robotic arm in identifying the graspable surface of disorderly stacked parcels. Leveraging transfer learning, pre-trained models are used to construct a multimodal network framework: DenseNet169, DenseNet121, ResNet101, and ResNet50 serve as candidate backbone networks and are trained and tested on custom parcel datasets. After comparative testing of model performance, DenseNet169 is chosen to construct the visual recognition network. Specifically, the RGB and depth data of parcels are fed separately into DenseNet169 for feature extraction; the extracted features are then fused across modalities, and a lightweight attention mechanism, CBAM, is integrated to enhance semantic segmentation accuracy. Finally, post-processing of the image features filters out the background, yielding precise identification of the parcel grasping region. The constructed network achieves an average top-1 true-class accuracy of 95.86% on the test set. The parcel visual recognition system designed with this methodology is therefore robust, meets the demand for autonomous parcel retrieval by the robotic arm, and offers significant support for resolving the challenges of automated parcel supply in logistics.
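The abstract names CBAM as the attention mechanism applied after multimodal fusion. As a point of reference, CBAM applies channel attention followed by spatial attention to a feature map. The sketch below is an illustrative NumPy implementation of those two stages, not the paper's code: the weight shapes, reduction ratio, and 7×7 spatial kernel are assumptions drawn from the standard CBAM formulation, and the weights would normally be learned rather than supplied.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Channel attention: shared MLP over avg- and max-pooled descriptors.

    x: feature map of shape (C, H, W)
    w1: (C//r, C), w2: (C, C//r) -- shared MLP weights, reduction ratio r assumed.
    """
    avg = x.mean(axis=(1, 2))                                   # (C,)
    mx = x.max(axis=(1, 2))                                     # (C,)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)
                  + w2 @ np.maximum(w1 @ mx, 0.0))              # (C,)
    return x * att[:, None, None]

def spatial_attention(x, k):
    """Spatial attention: conv over channel-wise mean/max maps.

    x: (C, H, W); k: (2, ks, ks) kernel applied to [mean_c; max_c].
    """
    desc = np.stack([x.mean(axis=0), x.max(axis=0)])            # (2, H, W)
    ks = k.shape[-1]
    p = ks // 2
    padded = np.pad(desc, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    att = np.empty((H, W))
    for i in range(H):                                          # naive same-size conv
        for j in range(W):
            att[i, j] = np.sum(padded[:, i:i + ks, j:j + ks] * k)
    return x * sigmoid(att)[None, :, :]

def cbam(x, w1, w2, k):
    """CBAM: channel attention, then spatial attention, both shape-preserving."""
    return spatial_attention(channel_attention(x, w1, w2), k)
```

Because both attention maps pass through a sigmoid, CBAM only rescales features in place: the output has the same shape as the input, which is what lets it be dropped into a fusion pipeline as a lightweight refinement step.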
