Abstract

Principal component analysis is a one-sample technique applied to data with no groupings among the observations and no partitioning of the variables into subvectors y and x. Principal components are concerned only with the core structure of a single sample of observations on p variables. In principal component analysis, we seek to maximize the variance of a linear combination of the variables. For example, the first principal component could be used to rank students on the basis of their scores on achievement tests in English, mathematics, reading, and so on. An average score would provide a single scale on which to compare the students, but with unequal weights in the principal component, we can spread the students out further on the scale and obtain a better ranking. The first principal component is the linear combination with maximal variance. The second principal component is the linear combination with maximal variance in a direction orthogonal to the first, and so on. The first principal component also represents the line that minimizes the total sum of squared perpendicular distances from the points to the line.

Principal components are often used as a dimension reduction device to obtain a smaller number of variables for input to another analysis. Another useful device is to evaluate the first two principal components for each observation vector and construct a scatter plot to check for multivariate normality, outliers, and so on.

The properties of principal components can be interpreted either geometrically or algebraically. Principal components are orthogonal because they are formed from eigenvectors of the covariance (or correlation) matrix, which is symmetric. Principal components are not scale invariant, because a scale change in one of the variables leads to a change in the shape of the swarm of points in the sample. Since principal components are not scale invariant, the components extracted from a covariance matrix differ from those obtained from the corresponding correlation matrix. In fact, the components of a given correlation matrix will serve for other correlation matrices.

There are four common methods for deciding how many components to retain in order to summarize the data effectively; three of the four are based on the eigenvalues of the covariance (or correlation) matrix. Because principal components from the covariance matrix lack scale invariance, their coefficients cannot be converted to standardized form, as can be done with the coefficients of discriminant functions in Chapter 8 and canonical variates in Chapter 11. Hence we use the coefficients themselves for interpretation. We must choose between the covariance matrix and the correlation matrix, knowing that they will yield different interpretations.

One aid to interpretation is to note that, for certain patterns of elements in the covariance or correlation matrix, the form of the principal components can be predicted. For example, if one variable has a much larger variance than the other variables, that variable will dominate the first component, which will then account for most of the variance. Another case in which a component essentially duplicates a variable occurs when the variable is uncorrelated with the other variables. In some settings, the principal components can be interpreted as measures of size and shape. Various methods of selecting a subset of variables are also discussed.
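Before turning to subset selection, the computations described above can be made concrete with a minimal sketch of principal components obtained by eigendecomposition of the covariance (or correlation) matrix. The data matrix Y, the simulated values, and the use of NumPy are illustrative assumptions, not material from the chapter.

```python
import numpy as np

# Hypothetical example: n = 100 observations on p = 4 variables with unequal variances.
rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 4)) * np.array([10.0, 2.0, 1.0, 0.5])

def principal_components(Y, use_correlation=False):
    """Eigendecomposition of the covariance (or correlation) matrix of Y."""
    M = np.corrcoef(Y, rowvar=False) if use_correlation else np.cov(Y, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(M)   # symmetric matrix -> real eigenvalues, orthogonal eigenvectors
    order = np.argsort(eigvals)[::-1]      # sort components by decreasing variance
    return eigvals[order], eigvecs[:, order]

# Components from the covariance matrix S differ from those based on R,
# reflecting the lack of scale invariance.
lam_S, A_S = principal_components(Y, use_correlation=False)
lam_R, A_R = principal_components(Y, use_correlation=True)

# Proportion of variance explained by each component -- one common basis
# for deciding how many components to retain.
print("proportion explained (S):", lam_S / lam_S.sum())

# Scores on the first two components, e.g., for a scatter plot to check
# for outliers or departures from multivariate normality.
scores = (Y - Y.mean(axis=0)) @ A_S[:, :2]
```

Because the decomposition is applied to a symmetric matrix, the coefficient vectors are orthogonal, which is the algebraic reason the resulting components are uncorrelated.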
Since there is no grouping variable or dependent variable in the setting of principal components, we wish to find the subset that best captures the internal variation and covariation of the variables. As usual, examples and problems amply illustrate the techniques in this chapter.
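As a hedged illustration of the subset-selection idea, the sketch below implements one simple backward-elimination heuristic, assumed here for illustration and not necessarily one of the methods developed in the chapter: a variable that dominates the component with the smallest eigenvalue contributes little to the internal variation and covariation and is dropped.

```python
import numpy as np

def select_subset(Y, keep):
    """Backward elimination (illustrative heuristic): repeatedly drop the variable
    with the largest absolute coefficient in the principal component having the
    smallest eigenvalue, since that component captures the least variation."""
    remaining = list(range(Y.shape[1]))
    while len(remaining) > keep:
        S = np.cov(Y[:, remaining], rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
        smallest = eigvecs[:, 0]               # eigenvector for the smallest eigenvalue
        remaining.pop(int(np.argmax(np.abs(smallest))))
    return remaining

# Illustrative use on simulated data: keep the 3 of 5 variables that best
# preserve the internal variation and covariation.
rng = np.random.default_rng(1)
Y = rng.normal(size=(60, 5))
print(select_subset(Y, keep=3))
```

The design choice is that variables aligned with near-zero-variance components are nearly redundant given the retained variables; the chapter's own selection criteria may differ.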
