Abstract
Automatic food recognition systems have received increasing attention in the research community as inductive learning has advanced (e.g., classification in computer vision), owing to their applicability in the healthcare and hospitality industries. However, food recognition is challenging because of its fine-grained nature and its high correlation with culture, geo-location, and language. To make food recognition systems feasible for the Middle Eastern region, we present a large-scale dataset (MEFood) of commonly consumed food items in the Middle East, thereby providing a dataset for current development and establishing a benchmark for future research. We also examine the MEFood dataset thoroughly, highlighting its challenging aspects and its real-world nature. Additionally, we conduct an experimental study benchmarking mainstream computer vision and mobile networks on classification, runtime, and resource utilization metrics. Our results show that EfficientNet-V2 achieves performance close to that of the best-performing individual model on the MEFood dataset while having the lowest resource utilization and minimal inference time. Finally, we perform an error analysis to glean additional insights about the networks and the MEFood dataset.