Abstract

To pursue a healthy lifestyle, people are increasingly concerned about their food ingredients. Recently, it has become a common practice to use an online recipe to select the ingredients that match an individual’s meal plan and healthy diet preference. The information from online recipes can be extracted and used to develop various food-related applications. Named entity recognition (NER) is often used to extract such information. However, the problem in building an NER system lies in the massive amount of data needed to train the classifier, especially on a specific domain, such as food. There are food NER datasets available, but they are still quite limited. Thus, we proposed an iterative self-training approach called semi-supervised multi-model prediction technique (SMPT) to construct a food ingredient NER dataset. SMPT is a deep ensemble learning model that employs the concept of self-training and uses multiple pre-trained language models in the iterative data labeling process, with a voting mechanism used as the final decision to determine the entity’s label. Utilizing the SMPT, we have created a new annotated dataset of ingredient entities obtained from the Allrecipes website named FINER. Finally, this study aims to use the FINER dataset as an alternative resource to support food computing research and development.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.