Abstract

Image-based food pattern classification poses new challenges for mainstream computer vision algorithms. Recent work on feature fusion techniques has significantly improved generalization performance on food categorization tasks. However, the use of representation learning in the training process of feature fusion has rarely been explored. This study addresses the issue with a new supervised subnetwork-based feature encoding and pattern classification model, termed the wide hierarchical subnetwork-based neural network (Wi-HSNN). Wi-HSNN is trained iteratively, with one pair of subnets added to the framework in each iteration. Furthermore, instead of learning the optimal representations from the whole dataset at once, this paper introduces a batch-by-batch parallel scheme for Wi-HSNN to process large-scale datasets, such as Places365 with more than 1.8 million samples. Extensive evaluations on eight benchmark datasets, ranging from food classification to scene image recognition, demonstrate that the proposed solution has better representation learning capacity than existing encoding methods and achieves stronger performance than existing approaches on food image classification tasks.
