Chapter 15 - Principal Component Analysis

Mamdouh Refaat

doi:10.1016/b978-012373577-5/50017-x

Abstract

Principal component analysis (PCA) is one of the oldest and most widely used methods for reducing multidimensional data. The basic idea of PCA is to find a set of linear transformations of the original variables such that the new variables describe most of the variance using relatively few variables. The new variables are presented, and in fact derived, in decreasing order of contribution: the first new variable, known as the first principal component, accounts for the largest proportion of the variance of the original variable set, the second principal component accounts for the next largest proportion, and so on. There are two main methods for performing PCA on a set of data. The first works with the variance–covariance matrix; however, the values in this matrix depend on the units and magnitude of each variable. The second works with the correlation matrix, which amounts to standardizing the variables first and so removes that dependence. The chapter presents the implementation of both methods, followed by the theoretical background of PCA. It also provides a SAS macro that implements PCA, with an example. Finally, it presents a modified macro that selects the most contributing variables that together account for a required percentage of the variance.
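The dependence on units can be illustrated with a minimal sketch. It is written in Python rather than SAS, is not the chapter's macro, and uses made-up data with only two variables so that the eigenvalues of the 2 x 2 matrix have a closed form. The function names (covariance_terms, eigenvalues_sym_2x2, variance_proportions) and the sample values are assumptions introduced here for illustration.

```python
import math

def covariance_terms(x, y):
    """Return (var_x, cov_xy, var_y) using the sample (n - 1) denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return var_x, cov_xy, var_y

def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    center = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return center + radius, center - radius

def variance_proportions(x, y, use_correlation=False):
    """Proportion of total variance carried by each principal component."""
    var_x, cov_xy, var_y = covariance_terms(x, y)
    if use_correlation:
        # Correlation matrix: unit diagonal, correlation coefficient off-diagonal.
        r = cov_xy / math.sqrt(var_x * var_y)
        var_x, cov_xy, var_y = 1.0, r, 1.0
    first, second = eigenvalues_sym_2x2(var_x, cov_xy, var_y)
    total = first + second
    return first / total, second / total

# Hypothetical data: the same measurements with x expressed in two different units.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 4.0]
x_rescaled = [1000 * v for v in x]

cov_original = variance_proportions(x, y)
cov_rescaled = variance_proportions(x_rescaled, y)
corr_original = variance_proportions(x, y, use_correlation=True)
corr_rescaled = variance_proportions(x_rescaled, y, use_correlation=True)

print("covariance, original units: ", cov_original)
print("covariance, rescaled x:     ", cov_rescaled)
print("correlation, original units:", corr_original)
print("correlation, rescaled x:    ", corr_rescaled)
```

In this sketch the covariance-based proportions shift almost entirely onto the first component once x is rescaled, while the correlation-based proportions stay at roughly 0.9 and 0.1 in both cases.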

Full Text