Abstract

The European Food Safety Authority (EFSA) develops methodologies and tools for the detection of emerging risks in food and feed. This includes the identification of drivers of emerging risks, such as food fraud, which requires innovative approaches. In this study, the Latent Dirichlet Allocation (LDA) topic model, an unsupervised machine learning technique, was applied to a media corpus retrieved from the Europe Media Monitor Medical Information System (EMM/MEDISYS), with a view to rapidly detecting specific food fraud incidents reported in the media. An LDA topic model can explore large collections of documents, discover the themes that run through the corpus, and organize and summarize the documents by the topics they contain, where a topic is defined as a set of words, each with a probability of belonging to that topic. Beeswax adulteration was taken as an example of a specific food fraud incident. Beeswax can be adulterated for financial gain and, although it is a product of apiculture, it may enter the food chain when it is placed as honeycomb in honey pots. For the beeswax example, a total of 2276 news articles were retrieved from EMM/MEDISYS and classified into 10 topics showing different levels of relatedness to beeswax adulteration. Manual screening of all articles validated the classification made by the topic model. The topics found to be most relevant did indeed contain articles on beeswax adulteration incidents reported by official sources. In addition, those topics contained signals of potential emerging risks in the cosmetics and food-wrapping sectors. The remaining topics highlighted the emergence of new beeswax market opportunities, which supported the identified signals.
It is concluded that the LDA topic model can be used to rapidly process information in the media, to support the definition of more specific food fraud filters on EMM/MEDISYS, and to be of direct use to all stakeholders involved in the monitoring, assessment and management of food fraud.
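
To make the notion of a topic as "words with their probability to belong to it" concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the implementation used in the study. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, the choice of two topics, and the six toy documents are all assumptions made up for illustration; they are not articles from EMM/MEDISYS.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenised documents.

    Returns (topics, doc_mix): topics[k] maps each word to its probability
    under topic k; doc_mix[d] lists the topic proportions of document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]          # words per topic, per doc
    topic_word = [Counter() for _ in range(n_topics)]   # word counts per topic
    topic_total = [0] * n_topics                        # total words per topic
    assignments = []

    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    # Resample each assignment conditioned on all the others.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimates of the word-per-topic and topic-per-document distributions.
    topics = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + V * beta) for w in vocab}
        for k in range(n_topics)
    ]
    doc_mix = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return topics, doc_mix


# Toy corpus (invented for illustration only).
docs = [
    "beeswax adulterated paraffin detected laboratory".split(),
    "paraffin stearin found adulterated beeswax foundation".split(),
    "authority reports adulterated beeswax paraffin".split(),
    "beeswax food wrap market grows".split(),
    "cosmetic market demand beeswax grows".split(),
    "new beeswax wrap products market".split(),
]

topics, doc_mix = fit_lda(docs, n_topics=2)

for k, topic in enumerate(topics):
    top_words = sorted(topic, key=topic.get, reverse=True)[:4]
    print(f"topic {k}:", top_words)
for d, mix in enumerate(doc_mix):
    print(f"doc {d}:", [round(p, 2) for p in mix])
```

Each entry of `topics` is a probability distribution over the vocabulary, and each entry of `doc_mix` gives the share of a document attributed to each topic; in a workflow like the one described above, articles could then be grouped by their dominant topic before manual screening. On a corpus this small the split between topics depends on the random seed, so the printed output should be read as illustrative only.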
