Abstract

Food recognition plays a critical role in various health-care applications. However, it poses many challenges to current approaches because food dishes vary widely in appearance and foods in the same category can have a non-uniform composition of ingredients. Current methods have two main limitations. First, they focus primarily on the appearance of foods without considering their semantic information, so they easily attend to the wrong areas of food images. Second, they lack dynamic weighting of multiple semantic features in the modeling process. This paper therefore proposes a novel Multi-View Attention Network within a multi-task learning framework that incorporates multiple semantic features from both ingredient recognition and recipe modeling into the food recognition task. It also uses a multi-view attention mechanism to automatically adjust the weights of different semantic features and enables the different tasks to interact with each other, yielding a more comprehensive feature representation. Experiments conducted on the ChineseFoodNet and VIREO Food-172 benchmark databases validate the proposed method, showing a clear improvement in performance and a smaller parameter size.
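To make the idea of dynamically weighting semantic features concrete, the following is a minimal sketch of attention-weighted fusion over several feature views. It is an illustration under stated assumptions, not the network proposed in this paper: the function names `softmax` and `fuse_views`, the fixed `query` vector (standing in for a learned parameter), the scaled dot-product scoring, and the toy feature values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(views, query):
    """Fuse equally sized feature vectors using attention weights.

    views: one feature vector per semantic view (hypothetical values here).
    query: vector used to score each view; an assumption standing in for
           a parameter that a real model would learn.
    """
    dim = len(query)
    # Scaled dot-product score for each view against the query.
    scores = [
        sum(v_i * q_i for v_i, q_i in zip(view, query)) / math.sqrt(dim)
        for view in views
    ]
    # Normalize the scores so the view weights sum to 1.
    weights = softmax(scores)
    # Weighted sum of the views, computed per feature dimension.
    fused = [
        sum(w * view[i] for w, view in zip(weights, views))
        for i in range(dim)
    ]
    return weights, fused


# Toy features for three views: image appearance, ingredients, recipe.
visual = [0.2, 0.1, 0.4]
ingredient = [0.9, 0.3, 0.1]
recipe = [0.5, 0.5, 0.5]
query = [1.0, 0.0, 0.0]

weights, fused = fuse_views([visual, ingredient, recipe], query)
print("view weights:", weights)
print("fused feature:", fused)
```

In this sketch the view whose features align best with the query receives the largest weight, so the fused representation leans toward that view while still retaining a contribution from the others.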
