A likelihood ratio model for the determination of the geographical origin of olive oil

Patryk Własiuk, Grzegorz Zadora, Agnieszka Martyna

doi:10.1016/j.aca.2014.10.022

Abstract

Food fraud or food adulteration may be of forensic interest, for instance in cases of suspected deliberate mislabeling. Because of the potential health benefits and nutritional qualities of olive oil, the determination of its geographical origin is of special interest. A likelihood ratio (LR) model has certain advantages over typical chemometric methods because it takes into account information about the rarity of a sample in a relevant population. Such properties are of particular interest to forensic scientists, and the aim of this study was therefore to examine olive oil classification using different LR models and to assess their pertinence under selected data pre-processing methods (logarithm-based data transformations) and a feature selection technique. The study was carried out on data describing 572 Italian olive oil samples characterised by the content of 8 fatty acids in the lipid fraction. Three classification problems related to three regions of Italy (South, North and Sardinia) were considered with the use of LR models. The correct classification rate and the empirical cross entropy were used as measures of the performance of each model. LR models proved useful for determining the geographical origin of olive oil across many variants of data pre-processing: the rates of correct classification were close to 100% and a considerable reduction of information loss was observed. The work also presents a comparative study of the performance of linear discriminant analysis in the considered classification problems. An approach to choosing the value of the smoothing parameter for kernel density estimation based LR models is highlighted as well.
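To make the quantities named above concrete, the following is a minimal, simplified sketch of a univariate, kernel density estimation based LR for a two-class problem, evaluated with the correct classification rate and the empirical cross entropy. It is not the authors' model. The data are synthetic rather than the 572 olive oil samples, the function names (`kde_pdf`, `rule_of_thumb_bandwidth`, `likelihood_ratio`, `empirical_cross_entropy`) are hypothetical, the smoothing parameter is set by the normal reference rule of thumb as a stand-in for whatever approach the paper proposes, and the class labels "A" and "B" are placeholders.

```python
import math
import random


def kde_pdf(x, sample, h):
    """Gaussian kernel density estimate at x; h is the smoothing parameter."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm


def rule_of_thumb_bandwidth(sample):
    """Normal reference rule of thumb (an assumption, not the paper's choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def likelihood_ratio(x, train_h1, train_h2):
    """LR = f(x | H1) / f(x | H2), with both densities estimated by KDE."""
    floor = 1e-300  # guards against division by zero far from all training data
    num = max(kde_pdf(x, train_h1, rule_of_thumb_bandwidth(train_h1)), floor)
    den = max(kde_pdf(x, train_h2, rule_of_thumb_bandwidth(train_h2)), floor)
    return num / den


def empirical_cross_entropy(lr_h1, lr_h2, prior_h1=0.5):
    """ECE for LRs computed when H1 is true (lr_h1) and when H2 is true (lr_h2)."""
    odds = prior_h1 / (1.0 - prior_h1)
    term_h1 = sum(math.log2(1.0 + 1.0 / (lr * odds)) for lr in lr_h1) / len(lr_h1)
    term_h2 = sum(math.log2(1.0 + lr * odds) for lr in lr_h2) / len(lr_h2)
    return prior_h1 * term_h1 + (1.0 - prior_h1) * term_h2


# Synthetic "fatty acid content" for two hypothetical regions A (H1) and B (H2),
# followed by a logarithm-based transformation as a pre-processing step.
random.seed(0)
region_a = [math.log10(random.lognormvariate(2.0, 0.1)) for _ in range(150)]
region_b = [math.log10(random.lognormvariate(2.4, 0.1)) for _ in range(150)]

# Simple hold-out split: the first 100 values train the densities, the rest test.
train_a, test_a = region_a[:100], region_a[100:]
train_b, test_b = region_b[:100], region_b[100:]

lr_a = [likelihood_ratio(x, train_a, train_b) for x in test_a]
lr_b = [likelihood_ratio(x, train_a, train_b) for x in test_b]

# LR > 1 supports H1 (region A); LR < 1 supports H2 (region B).
correct = sum(lr > 1.0 for lr in lr_a) + sum(lr < 1.0 for lr in lr_b)
rate = correct / (len(lr_a) + len(lr_b))
ece = empirical_cross_entropy(lr_a, lr_b)

print(f"correct classification rate: {rate:.3f}")
print(f"empirical cross entropy at prior 0.5: {ece:.3f}")
```

With equal priors, a model that always returns LR = 1 gives an empirical cross entropy of exactly 1 bit, so values below 1 indicate a reduction of information loss relative to that neutral reference.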

Full Text