Abstract

An important issue in modern science remains the integration of multiple data sources, considered as datasets divided into blocks of variables measured on the same set of observations. When analyzing these blocks, the path modeling approach aims to take into account a specific pattern of directed relations between them. This structure, usually set up from prior knowledge, leads to a path diagram in which the blocks' components are linked to one another. Although path modeling has gained popularity in recent decades, the issue of prediction has been surprisingly neglected in the literature. To fill this gap, we propose a prediction framework dedicated to the path modeling approach. Prediction for path modeling methods is of paramount interest for (i) selecting the optimal model with cross-validation (e.g., optimal model specification and dimension), (ii) comparing the model with other relevant ones, and (iii) predicting new observations. Specifically, we give an explicit formulation of a prediction model in the context of a path modeling structure. This generic prediction model is estimated using the PLS Path Modeling algorithm and is denoted PMM-PLSPM (Prediction Model for Multiblock data estimated by PLS Path Modeling). Its versatility nevertheless makes it possible to adapt the approach to any other composite-based method. Two estimation strategies (PLSPM-like and PLSR-like) are proposed for prediction, both being viewed as the structural model at the variable level. At the same time, the components of multiple blocks, obtained by a relevant deflation strategy, are handled to take advantage of the full multidimensional potential of the blocks. The proposed prediction strategies are applied and evaluated on real data sets.

