Prediction of Pseudomonas spp. Population in Food Products and Culture Media Using Machine Learning-Based Regression Methods.

Fatih Tarlak, Özgün Yücel

doi:10.3390/life13071430

Open Access

Abstract

Machine learning approaches offer an alternative to the traditional modelling equations used in predictive food microbiology. They use algorithms to analyse large datasets describing microbial growth or survival in various food matrices and to predict the behaviour of microorganisms in different food environments. The objective of this study was to apply several machine learning-based regression methods, including support vector regression (SVR), Gaussian process regression (GPR), decision tree regression (DTR), and random forest regression (RFR), to estimate bacterial populations. To this end, a total of 5618 data points for Pseudomonas spp. in food products (beef, pork, and poultry) and culture media were gathered from the ComBase database. The algorithms were used to predict the growth or survival behaviour of Pseudomonas spp. in these food products and culture media from predictor variables such as temperature, salt concentration, water activity, and acidity. The suitability of the algorithms was assessed using statistical indices including the coefficient of determination (R²), root mean square error (RMSE), bias factor (Bf), and accuracy factor (Af). Each of the regression algorithms showed appropriate estimation capabilities, with R² ranging from 0.886 to 0.913, RMSE from 0.724 to 0.899, Bf from 1.012 to 1.020, and Af from 1.086 to 1.101 for each food product and culture medium. Because RFR showed the best predictive capability among the algorithms, it was further validated against externally collected data from the literature. This external validation yielded Bf values ranging from 0.951 to 1.040 and Af values ranging from 1.091 to 1.130, indicating that RFR can be used to predict the survival and growth of microorganisms in food products.
Machine learning approaches can therefore be considered an alternative to conventional modelling methods in predictive microbiology. However, the predictive power of a machine learning regression method depends directly on dataset size, and a large dataset is required for modelling. Consequently, the models developed in this study can only be used to predict Pseudomonas spp. in the specific food products (beef, pork, and poultry) and culture media, and under the conditions, for which a large dataset is available.
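
The four indices named in the abstract can be illustrated with a short sketch. The abstract does not give the formulas, so the code below assumes the definitions commonly used in predictive microbiology: Bf = 10^(mean(log10(predicted/observed))) and Af = 10^(mean(|log10(predicted/observed)|)), with RMSE computed over n observations without a degrees-of-freedom correction. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error over n points (assumed: no correction for model parameters)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def bias_factor(observed, predicted):
    """Bf > 1 suggests over-prediction on average, Bf < 1 under-prediction."""
    logs = [math.log10(p / o) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


def accuracy_factor(observed, predicted):
    """Af >= 1; values closer to 1 mean predictions deviate less from observations."""
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


if __name__ == "__main__":
    # Illustrative values only (e.g. log CFU/g); not data from the study.
    observed = [3.0, 4.5, 6.0, 7.5]
    predicted = [3.1, 4.4, 6.2, 7.4]
    print("R2  :", r_squared(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
    print("Bf  :", bias_factor(observed, predicted))
    print("Af  :", accuracy_factor(observed, predicted))
```

Under these assumed definitions, Bf and Af both equal 1 for a perfect model. Bf can stay close to 1 when over- and under-predictions cancel out, which is why it is read together with Af.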

Full Text