Abstract

Multi-set multivariate data analysis methods provide a way to analyze a series of tables together. In particular, the STATIS-dual method applies to data tables in which the individuals can vary from one table to another but the variables analyzed remain fixed. However, when the number of variables or indicators is large, interpretation through traditional multiple-set methods becomes complex. For this reason, this paper proposes a new methodology, which we call Sparse STATIS-dual. It implements the elastic net penalty technique, which seeks to retain the most important variables of the model and to obtain more precise and interpretable results. To complement the new methodology and make it applicable to data tables with fixed variables, a package was created in the R programming language under the name Sparse STATIS-dual. Finally, an application to real data is presented, and the results of STATIS-dual and Sparse STATIS-dual are compared. The proposed method improves the informative capacity of the data and offers more easily interpretable solutions.

Highlights

  • Classic methods of multivariate analysis operate with two-way data [1], whose rows and columns collect, in a data matrix, the information provided by individuals and variables, respectively. When this matrix is analyzed, all the variables are considered at the same time, and the information extracted represents a global vision of the system [2,3]

  • To expose the main aspects of the STATIS-dual and the Sparse STATIS-dual, and to recognize the usefulness of both methods in the analysis of three-way data, we used panel data (2016–2020) from the Global Innovation Index [26], which integrates 80 global innovation indicators in more than 130 economies. This index captures the multidimensional facets of innovation between countries, and supports the monitoring of innovation factors that allow the formulation of more effective public policies for society and the world economy

  • One of the most important areas of current research in multivariate data analysis focuses on the development of efficient techniques for the study of large data matrices [22,58,59]

Introduction

On many occasions, experiments are designed in which the variables are examined at different moments in time, giving rise to the application of multivariate data analysis techniques in three modes [4,5]. Data organized in three ways therefore have a first index that identifies the individuals under study, a second index for the variables measured on those individuals, and a third index for the various situations (moments) in which the measurements are made [6]. The purpose of integrating a third way is to analyze the similarities and differences between the situations through the configurations of the individuals and the relationships between the groups of variables. Following this concept, Kiers [7] classifies data of this kind into three-way data and multiple-set data. He defines three-way data as a set of data corresponding to the observations of all objects on all variables and on all occasions, and multiple-set data as observations on different sets of objects and/or variables at different times [8,9].
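The distinction between the two layouts can be made concrete with a small sketch. The Python snippet below is an illustration only, not the authors' R package: the sizes, the random values, and the helper names `random_table` and `covariance_matrix` are all assumptions introduced here. It builds a three-way array (same individuals, same variables, every occasion) and a multiple-set collection in which the variables stay fixed while the number of individuals changes per occasion. It then shows that each table in the second layout still yields a variables-by-variables covariance matrix of the same size, which is what makes tables with fixed variables comparable across occasions.

```python
import random

random.seed(0)

N_VARIABLES = 3  # hypothetical number of variables, fixed across occasions


def random_table(n_rows, n_cols):
    """Return an n_rows x n_cols table (list of rows) of made-up values."""
    return [[random.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]


# Three-way data: the same 5 individuals on the same 3 variables at 4 occasions.
# Indexing is cube[occasion][individual][variable].
cube = [random_table(5, N_VARIABLES) for _ in range(4)]

# Multiple-set data with fixed variables: the individuals (rows) differ
# from one occasion to the next, the columns do not.
rows_per_occasion = [5, 7, 4, 6]  # illustrative sizes, not from the paper
tables = [random_table(n, N_VARIABLES) for n in rows_per_occasion]


def covariance_matrix(table):
    """Sample covariance matrix (variables x variables) of one table."""
    n = len(table)
    p = len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(p)]
    return [
        [
            sum((row[j] - means[j]) * (row[k] - means[k]) for row in table) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


# One covariance matrix per occasion; all share the same dimensions
# even though the tables have different numbers of rows.
cov_matrices = [covariance_matrix(t) for t in tables]
shapes = [(len(c), len(c[0])) for c in cov_matrices]
print(shapes)
```

In the three-way layout a single index triple addresses any observation, whereas in the multiple-set layout only the variable index is shared across tables, so comparisons have to go through objects defined on the variables.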
