Abstract

The central problem in unmanned retail is determining which customer bought which product. To address this problem, we propose a method that combines deep-learning-based human pose estimation with commodity detection. To process video in real time, we modify the human pose estimation algorithm by applying depthwise separable convolution, reducing the convolution kernel size, and fusing multi-stage information. To detect commodities, we construct a commodity detection dataset and use it to train an object detection model. The modified pose estimation algorithm locates the key points of the left and right wrists, and the object detection algorithm recognizes the products present in the current image. We then calculate the distance between the wrist key points and the products and use this value to determine whether the customer has purchased a given product. We tested the proposed pipeline in real scenarios, where it was able to determine whether the user purchased the product. Because the method relies only on computer vision, it is convenient to deploy and facilitates the further development of unmanned retail. The research results have been applied in brick-and-mortar stores.
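
The wrist-to-product distance test can be illustrated with a minimal sketch. The abstract does not specify how the distance is measured or what cutoff is used, so the following assumes Euclidean distance in pixels from each wrist key point to the center of each detected bounding box, with a hypothetical threshold `max_distance`. The names `Detection`, `box_center`, and `products_near_wrists` are illustrative and not taken from the paper.

```python
import math
from typing import NamedTuple


class Detection(NamedTuple):
    """One detected product (hypothetical structure for illustration)."""
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def box_center(box):
    """Return the (x, y) center of an axis-aligned bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def products_near_wrists(wrists, detections, max_distance):
    """Map each product label to its smallest wrist distance,
    keeping only products within max_distance of some wrist.

    wrists: list of (x, y) key points, e.g. left and right wrist.
    max_distance: assumed pixel threshold, not a value from the paper.
    """
    picked = {}
    if not wrists:
        return picked  # no wrist key points found in this frame
    for det in detections:
        cx, cy = box_center(det.box)
        # Distance from the product center to the closest wrist.
        nearest = min(math.hypot(wx - cx, wy - cy) for wx, wy in wrists)
        if nearest <= max_distance:
            # If a label is detected more than once, keep the closest one.
            picked[det.label] = min(nearest, picked.get(det.label, nearest))
    return picked
```

In a real deployment the decision would presumably be made over several consecutive frames rather than a single image, and the threshold would need to be tuned to the camera geometry; both points are assumptions beyond what the abstract states.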
