CafeteriaFCD Corpus: Food Consumption Data Annotated with Regard to Different Food Semantic Resources.

Gordana Ispirova, Barbara Koroušić Seljak, Eva Valenčič, Peter Korošec, Ermanno Cavalli, Matevž Ogrinc, Riste Stojanov, Tome Eftimov, Gjorgjina Cenikj

doi:10.3390/foods11172684

Abstract

Despite numerous studies in the last decade involving food and nutrition data, this domain remains low-resourced. Annotated corpora are very useful tools for domain researchers and experts, as well as for data scientists performing analyses. In this paper, we present the process of annotating food consumption data (recipes) with semantic tags from different semantic resources: the Hansard taxonomy, the FoodOn ontology, the SNOMED CT terminology, and the FoodEx2 classification system. FoodBase is an annotated corpus of food entities (recipes) that includes a curated version of 1000 instances, considered a gold standard. In this study, we use the curated version of FoodBase and two different annotation approaches: the NCBO annotator (for the FoodOn and SNOMED CT annotations) and the semi-automatic StandFood method (for the FoodEx2 annotations). The end result is a new version of the FoodBase gold standard, called the CafeteriaFCD (Cafeteria Food Consumption Data) corpus. This corpus contains food consumption data (recipes) annotated with semantic tags from the four aforementioned external semantic resources. With these annotations, data interoperability is achieved between five semantic resources from different domains. This resource can be further used to develop and train information extraction pipelines that use state-of-the-art NLP approaches for tracing knowledge about food safety applications.
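
As a rough illustration of the first annotation approach, the sketch below builds a request URL for the NCBO BioPortal Annotator REST endpoint, restricted to the FoodOn and SNOMED CT ontologies. It is not the authors' pipeline: the recipe sentence, the placeholder API key, and the helper name `build_annotator_url` are assumptions made for this example, and the ontology acronyms `FOODON` and `SNOMEDCT` are assumed to match the identifiers used on BioPortal. No request is sent; the code only constructs the URL.

```python
from urllib.parse import urlencode

# Public REST endpoint of the NCBO BioPortal Annotator.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_url(text, ontologies, api_key):
    """Return a GET URL asking the annotator to tag `text` with the given ontologies.

    `build_annotator_url` is a hypothetical helper, not part of any library.
    """
    params = {
        "text": text,
        # The annotator accepts a comma-separated list of ontology acronyms.
        "ontologies": ",".join(ontologies),
        "apikey": api_key,
    }
    return ANNOTATOR_URL + "?" + urlencode(params)


# Hypothetical recipe sentence and placeholder key, for illustration only.
url = build_annotator_url(
    "Mix the flour with two eggs and a cup of milk.",
    ["FOODON", "SNOMEDCT"],
    "YOUR_API_KEY",
)
print(url)
```

Sending this URL with a valid BioPortal API key should return a JSON list of matched ontology classes with their text offsets, which could then be attached to the recipe as semantic tags.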

Journal: Foods	Publication Date: Sep 2, 2022
Citations: 3	License type: CC BY 4.0
