Exploration of Data Fusion Strategies Using Principal Component Analysis and Multiple Factor Analysis

Mpho Mafata, Martin Kidd, Astrid Buica, Andrei Medvedovici, Jeanne Brand

doi:10.3390/beverages8040066

Open Access

Journal: Beverages
Publication Date: Oct 21, 2022
Citations: 4
License type: CC BY 4.0

Affiliation: Stellenbosch University, University of Bucharest

Abstract

In oenology, statistical analyses are used for descriptive purposes, mostly with separate sensory and chemistry data sets. Cases that combine them are mostly supervised, usually seeking to optimize discrimination, classification, or prediction power. Unsupervised methods are used as preliminary steps to achieving success in supervised models. However, there is potential for unsupervised methods to combine different data sets into comprehensive, information-rich models. This study detailed stepwise strategies for creating data fusion models using unsupervised techniques at different levels. Principal component analysis (PCA) and multiple factor analysis (MFA) were used to combine five data blocks (four chemistry and one sensory). The model efficiency and configurational similarity were evaluated using eigenvalues and regression vector (RV) coefficients, respectively. The MFA models were less efficient than PCA, having gradual distributions of eigenvalues across model dimensions. The MFA models were more representative than PCA, as indicated by high RV coefficients between MFA and each individual block. Therefore, MFA approaches were better suited for multi-modal data than PCA. This work approached data fusion systematically and showed the type of decisions that must be made and how to evaluate their consequences. Proper integration of data sets, instead of concatenation, is an important aspect to consider in multi-modal data fusion.
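As a rough illustration of the two evaluation criteria named in the abstract, the Python sketch below fuses five blocks by plain concatenation (PCA) and by MFA-style block weighting, then reports the eigenvalue distribution and the RV coefficient between the fused scores and each block. It is a minimal sketch under stated assumptions, not the authors' pipeline: the data are synthetic random numbers, the function names (`rv_coefficient`, `fuse`) are hypothetical, columns are only mean-centered, the MFA weighting is the usual textbook form (each block divided by its first singular value), and the RV coefficient is taken in its common matrix-correlation form.

```python
import numpy as np


def center(X):
    """Mean-center each column (variable) of a samples-by-variables matrix."""
    return X - X.mean(axis=0)


def rv_coefficient(X, Y):
    """RV coefficient between two matrices that share the same rows (samples).

    Assumed definition: tr(Sx Sy) / sqrt(tr(Sx Sx) * tr(Sy Sy)),
    where Sx = X X^T and Sy = Y Y^T after column centering.
    """
    X, Y = center(X), center(Y)
    Sx, Sy = X @ X.T, Y @ Y.T
    return np.trace(Sx @ Sy) / np.sqrt(np.trace(Sx @ Sx) * np.trace(Sy @ Sy))


def fuse(blocks, weight_blocks, n_components=2):
    """Concatenate blocks and run PCA via SVD.

    weight_blocks=False: PCA on the plain concatenation.
    weight_blocks=True:  MFA-style fusion, each centered block is divided by
                         its first singular value before concatenation.
    Returns (sample scores, eigenvalues of all dimensions).
    """
    prepared = []
    for B in blocks:
        Bc = center(B)
        if weight_blocks:
            Bc = Bc / np.linalg.svd(Bc, compute_uv=False)[0]
        prepared.append(Bc)
    Z = np.hstack(prepared)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components], s ** 2


# Synthetic stand-in for five data blocks measured on the same 12 samples.
# Block widths are arbitrary and do not reflect the study's data.
rng = np.random.default_rng(0)
n_samples = 12
latent = rng.normal(size=(n_samples, 2))
blocks = [
    latent @ rng.normal(size=(2, p)) + 0.3 * rng.normal(size=(n_samples, p))
    for p in (6, 4, 10, 5, 8)
]

pca_scores, pca_eig = fuse(blocks, weight_blocks=False)
mfa_scores, mfa_eig = fuse(blocks, weight_blocks=True)

# Model efficiency: share of total variance carried by each dimension.
pca_explained = pca_eig / pca_eig.sum()
mfa_explained = mfa_eig / mfa_eig.sum()

# Configurational similarity: RV between fused scores and each block.
rv_pca = [rv_coefficient(pca_scores, B) for B in blocks]
rv_mfa = [rv_coefficient(mfa_scores, B) for B in blocks]

print("PCA explained (first 3):", np.round(pca_explained[:3], 3))
print("MFA explained (first 3):", np.round(mfa_explained[:3], 3))
print("RV, PCA scores vs blocks:", np.round(rv_pca, 3))
print("RV, MFA scores vs blocks:", np.round(rv_mfa, 3))
```

With block weighting, no single block can dominate the first dimension simply because it has more variables or larger variance, which is the sense in which integration differs from concatenation. The numbers printed for this synthetic data say nothing about the study's results.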

Full Text