Bees produce honey through the collection and transformation of nectar, whose botanical origin impacts the taste, nutritional value, and, therefore, the market price of the resulting honey. This price difference has led some to mislabel their honey so that it can be sold at a higher price. Metabolomics has been gaining popularity in food authentication, but rapid data mining algorithms are needed to facilitate the discovery of new authenticity markers. A nontargeted high-resolution liquid chromatography-mass spectrometry (HR/LC-MS) analysis of 262 monofloral honey samples, of which 50 were blueberry honey, was performed. Data mining methods were demonstrated for the discovery of three marker types: binary single-markers (the compound was detected only in blueberry honey), threshold single-markers (the compound had its highest concentration in blueberry honey), and interval ratio-markers (the ratio of two compounds fell within an interval unique to blueberry honey). A novel convolutional algorithm was developed for the discovery of interval ratio-markers; it trained 14× faster than existing open-source implementations and achieved a classification score 0.2 Matthews correlation coefficient (MCC) units higher. The convolutional algorithm also had classification performance similar to that of a brute-force search but trained 1521× faster. A pipeline for shortlisting candidate authenticity markers from the LC-MS spectra that may be suitable for chemical structure identification was also demonstrated and led to the identification of niacin as a threshold single-marker for blueberry honey. This work demonstrates an end-to-end approach to mining the honey metabolome for novel authenticity markers and can readily be applied to other types of food and to other analytical chemistry instruments.
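To make the three marker definitions concrete, the following minimal sketch expresses each one as a classification rule and scores it with the MCC. It is not the convolutional algorithm described above: the interval ratio search shown here is a plain brute-force loop over compound pairs, and all function names, the toy intensity matrix, and the choice to fit each interval to the target samples' own ratio range are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for two boolean arrays."""
    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def binary_rule(x):
    """Binary single-marker: predict target wherever the compound is detected."""
    return x > 0


def threshold_rule(x, is_target):
    """Threshold single-marker: predict target above the highest non-target intensity."""
    return x > x[~is_target].max()


def interval_ratio_rule(a, b, is_target):
    """Interval ratio-marker: predict target where a/b lies in the target ratio range."""
    ratio = np.full(a.shape, np.nan)
    np.divide(a, b, out=ratio, where=b > 0)  # undefined ratios stay NaN
    lo = np.nanmin(ratio[is_target])
    hi = np.nanmax(ratio[is_target])
    return (ratio >= lo) & (ratio <= hi)  # NaN compares False


def brute_force_ratio_search(X, is_target):
    """Score every compound pair (hypothetical helper, training-set MCC only)."""
    best_pair, best_score = None, -np.inf
    for i, j in combinations(range(X.shape[1]), 2):
        score = mcc(is_target, interval_ratio_rule(X[:, i], X[:, j], is_target))
        if score > best_score:
            best_pair, best_score = (i, j), score
    return best_pair, best_score


# Toy intensities (rows: samples, columns: compounds); the first two rows
# play the role of the target honey. These numbers are invented.
X = np.array([
    [2.0, 1.0, 5.0],
    [4.0, 2.0, 1.0],
    [1.0, 2.0, 3.0],
    [8.0, 1.0, 2.0],
    [3.0, 6.0, 4.0],
    [6.0, 1.5, 6.0],
])
is_target = np.array([True, True, False, False, False, False])

best_pair, best_score = brute_force_ratio_search(X, is_target)
print(best_pair, best_score)
```

In this toy matrix no single compound separates the two target samples from the rest, but the ratio of the first two compounds does. The brute-force loop evaluates every pair, so its cost grows quadratically with the number of compounds, which is the cost a faster search method would aim to avoid. The scores here are computed on the same samples used to fit the intervals, so they indicate training fit rather than held-out performance.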