Abstract

Due to the availability of large-scale datasets (e.g., ImageNet, UECFood) and the advancement of deep Convolutional Neural Networks (CNNs), image recognition in computer vision has evolved dramatically. There are currently three major approaches to using a CNN: training from scratch, using a pre-trained network off the shelf, and performing unsupervised pre-training followed by supervised fine-tuning. For people with dietary restrictions, automatic food detection and assessment are critical. In this research, we show how detection difficulties can be addressed by combining three CNNs. First, the different CNN architectures are assessed; the number of parameters in the examined models ranges from 5,000 to 160 million, depending on the number of layers. Second, the CNNs under consideration are assessed with respect to dataset size and the physical context of the images, and the results are compared in terms of performance, training time, and accuracy. Third, the accuracy of the CNNs is examined against human knowledge and classification by the human visual system (HVS). Finally, additional categorization techniques, such as bag-of-words, are considered for this problem. Based on the findings, we conclude that the HVS is more accurate when a dataset comprises a wide range of variables, whereas the CNN outperforms the HVS when the dataset is restricted to niche photos.

Keywords: CNN, GoogLeNet, Inception, ResNet, Dietary
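To make the first two CNN usage strategies from the abstract concrete, the sketch below contrasts training from scratch with fine-tuning a pre-trained network off the shelf. This is an illustrative example only, not the authors' implementation: ResNet-50 is chosen because ResNet appears in the keywords, and the number of classes, frozen layers, and optimizer settings are assumed placeholders.

```python
# Illustrative sketch (not the paper's code): two of the three CNN usage
# strategies -- training from scratch vs. fine-tuning a pre-trained
# network -- using torchvision's ResNet-50 as the example architecture.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of food categories

# Strategy 1: train from scratch (randomly initialised weights).
scratch_model = models.resnet50(weights=None)
scratch_model.fc = nn.Linear(scratch_model.fc.in_features, NUM_CLASSES)

# Strategy 2: use a pre-trained network off the shelf and fine-tune only
# the final classification layer (transfer learning).
pretrained_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for param in pretrained_model.parameters():
    param.requires_grad = False  # freeze the convolutional backbone
pretrained_model.fc = nn.Linear(pretrained_model.fc.in_features, NUM_CLASSES)

# Only the new classifier head is updated during training.
optimizer = torch.optim.SGD(pretrained_model.fc.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```

The third strategy mentioned in the abstract, unsupervised pre-training followed by supervised fine-tuning, follows the same pattern as Strategy 2, except that the backbone weights come from a self-supervised or unsupervised objective rather than ImageNet labels.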
