Abstract

In recent years, deep convolutional neural networks (CNNs) have achieved remarkable performance in object detection and image classification. However, large-scale image recognition still poses practical challenges. Specifically, visual separability between object categories is highly uneven, and some categories are strongly similar to one another. Existing CNNs are trained as flat n-way classifiers, which is usually insufficient to meet these challenges. We therefore propose a framework for large-scale commodity recognition, the multi-task cascade deep convolutional neural network (MTCD-CNN), which consists of two phases: object detection and hierarchical image classification. First, the object detection framework locates and crops the regions that may contain objects. Then, hierarchical spectral clustering is used to construct a category tree and a corresponding tree-like image classification model. During the testing phase, hard-to-distinguish objects are classified from coarse to fine by searching a path through the category tree. The proposed hierarchical image classification method also provides insight into the data by identifying the groups of classes that are hard to classify and require more attention than others. Extensive experiments and comparative analyses on commodity detection in supermarkets and stores in Jinzhou city show that MTCD-CNN outperforms other advanced methods, indicating that the proposed method effectively addresses the problem of confusingly similar categories.
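
The coarse-to-fine search through the category tree can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CategoryNode` type, the `classify_coarse_to_fine` function, the toy category names, and the per-node scorers are all hypothetical stand-ins. In the setting described above, each scorer's role would presumably be played by a CNN classifier attached to that node of the tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class CategoryNode:
    """Hypothetical tree node; the abstract does not specify a data structure."""
    name: str
    children: List["CategoryNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


# A scorer maps (features, candidate children) to one score per child.
# Assumption: this stands in for a per-node classifier such as a CNN.
Scorer = Callable[[Sequence[float], List[CategoryNode]], List[float]]


def classify_coarse_to_fine(
    features: Sequence[float],
    root: CategoryNode,
    scorers: Dict[str, Scorer],
) -> List[str]:
    """Descend from the root to a leaf, taking the best-scoring child each time.

    Returns the names along the chosen path, coarse category first.
    """
    path = [root.name]
    node = root
    while not node.is_leaf():
        scores = scorers[node.name](features, node.children)
        best = max(range(len(scores)), key=scores.__getitem__)
        node = node.children[best]
        path.append(node.name)
    return path


def demo(features: Sequence[float]) -> List[str]:
    # Toy two-level tree with made-up commodity categories.
    tree = CategoryNode("root", [
        CategoryNode("beverages", [CategoryNode("cola"), CategoryNode("lemon_soda")]),
        CategoryNode("snacks", [CategoryNode("chips"), CategoryNode("crackers")]),
    ])

    # Toy scorer: feature i decides between a node's first and second child.
    def by_dim(i: int) -> Scorer:
        return lambda x, kids: [1.0 - x[i], x[i]]

    scorers = {"root": by_dim(0), "beverages": by_dim(1), "snacks": by_dim(2)}
    return classify_coarse_to_fine(features, tree, scorers)


if __name__ == "__main__":
    print(demo([0.2, 0.9, 0.5]))
```

In this sketch, each decision only has to separate the children of one node, so visually similar leaf classes that share a parent are compared by a scorer dedicated to that group rather than by a single flat n-way classifier.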
