Abstract

Principal component analysis (PCA) is a popular dimension-reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples. Supplementary materials for this article are available online.
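To make the idea of choosing a transformation by maximum profile likelihood concrete, here is a minimal sketch and not the authors' algorithm. It assumes a Box-Cox transformation family with a single parameter `lam` shared by all variables, a Gaussian model in which the transformed data equal column means plus a rank-`k` matrix plus i.i.d. noise, and a simple grid search. The function names (`box_cox`, `profile_loglik`, `select_lambda`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of positive data (assumed family, not from the source)."""
    if abs(lam) < 1e-12:
        return np.log(x)
    return (x ** lam - 1.0) / lam

def profile_loglik(X, lam, k):
    """Profile log-likelihood of lam, up to an additive constant.

    Assumed model: box_cox(X, lam) = column means + rank-k matrix + i.i.d. Gaussian noise.
    The means, the rank-k part, and the noise variance are profiled out.
    """
    n, p = X.shape
    Y = box_cox(X, lam)
    Yc = Y - Y.mean(axis=0)
    # Residual sum of squares of the best rank-k fit = sum of the discarded squared singular values.
    s = np.linalg.svd(Yc, compute_uv=False)
    rss = np.sum(s[k:] ** 2)
    sigma2 = rss / (n * p)
    # Gaussian term plus the log-Jacobian of the Box-Cox transform.
    return -0.5 * n * p * np.log(sigma2) + (lam - 1.0) * np.sum(np.log(X))

def select_lambda(X, k, grid):
    """Grid search for the lam that maximizes the profile log-likelihood."""
    values = np.array([profile_loglik(X, lam, k) for lam in grid])
    return grid[int(np.argmax(values))], values

# Simulated example: the data are low-rank on the log scale, so lam near 0 is expected.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
Y_true = scores @ loadings + 0.1 * rng.normal(size=(200, 6))
X = np.exp(Y_true)

grid = np.linspace(-1.0, 1.0, 21)
lam_hat, values = select_lambda(X, k=2, grid=grid)
print("selected lambda:", lam_hat)
```

The log-Jacobian term `(lam - 1) * sum(log X)` is what makes likelihoods comparable across different values of `lam`; without it, the criterion would simply favor transformations that shrink the scale of the data.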

Highlights

  • Principal component analysis (PCA) and its extensions are commonly used dimension reduction techniques that transform a collection of correlated variables into a small number of uncorrelated variables called principal components

  • In functional PCA (FPCA), some regularization is needed to take into account the underlying smoothness of the functional data

  • We present the functional version of the PCA.t algorithm, referred to as FPCA.t, as follows: 1. Start from an initial estimate of η, denoted by η0 (t = 0)


Summary

Introduction

Principal component analysis (PCA) and its extensions are commonly used dimension reduction techniques that transform a collection of correlated variables into a small number of uncorrelated variables called principal components. Two common ways of motivating PCA are (a) finding a small number of linear combinations of the variables that account for most of the variance in the observed data (Hotelling, 1933); and (b) obtaining the best low-rank matrix approximation of the data matrix (Pearson, 1901; Jolliffe, 2002, Section 3.5). The transformation parameters can be estimated using the maximum profile likelihood. The details of the proposed methods, including computational algorithms, are given in Sections 2 and 3, which treat the ordinary and functional data structures, respectively. The appendix contains detailed derivations of the algorithms presented in the main text.
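The two motivations for PCA mentioned above lead to the same subspace, which can be checked numerically. The sketch below is illustrative only: the function name `pca_two_views` and the simulated data are hypothetical. It computes the leading directions from the eigendecomposition of the sample covariance matrix (view a) and from the truncated singular value decomposition of the centered data matrix (view b), then compares the two.

```python
import numpy as np

def pca_two_views(X, k):
    """Compute the top-k principal subspace in two ways for a data matrix X (rows = observations)."""
    Xc = X - X.mean(axis=0)

    # View (a): directions of maximal variance = leading eigenvectors of the sample covariance.
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    V_eig = eigvecs[:, order]

    # View (b): best rank-k approximation of the centered data via truncated SVD.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_svd = Vt[:k].T
    X_lowrank = (U[:, :k] * s[:k]) @ Vt[:k]
    return V_eig, V_svd, X_lowrank, Xc

# Simulated data with three strong directions plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 6)) + 0.1 * rng.normal(size=(200, 6))
V_eig, V_svd, X_lowrank, Xc = pca_two_views(X, k=2)

# Individual vectors may differ in sign, so compare the projectors onto the two subspaces.
P_eig = V_eig @ V_eig.T
P_svd = V_svd @ V_svd.T
print("same subspace:", np.allclose(P_eig, P_svd, atol=1e-8))
print("low-rank fit equals projection:", np.allclose(X_lowrank, Xc @ P_eig, atol=1e-8))
```

Comparing projectors rather than the vectors themselves avoids spurious mismatches, since each principal direction is determined only up to sign.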

Integrating the data transformation to PCA by profile likelihood
Handling missing data
Integrating the data transformation in functional PCA
Choosing the penalty parameter
Missing data and the functional data structure
Simulation
Real data
Call Center Data
Findings
Discussion