Overview of principal component analysis algorithm


Similar Papers
  • Conference Article
  • Citations: 14
  • 10.1109/ftc.2016.7821659
Performance evaluation of the PCA versus improved PCA (IPCA) in image compression, and in face detection and recognition
  • Dec 1, 2016
  • Abdulaziz A Alorf

Principal components analysis (PCA) is one of the most important and widely used algorithms, with applications including data dimension reduction, data compression (such as image compression), and pattern recognition (such as face detection and recognition). The improved principal components analysis (IPCA) algorithm is similar to the PCA algorithm, except that it uses concepts from Shannon information theory to improve on it. It has been claimed that the IPCA algorithm performs better than the PCA algorithm. Because the PCA algorithm is so widely used, we were motivated to compare the behavior of the PCA and IPCA algorithms, both theoretically and empirically, in different applications. This paper validates the IPCA algorithm on images for the first time. It also proposes a new learning method for face detection and recognition using the PCA and IPCA algorithms. In addition, this paper evaluates the performance of the PCA algorithm versus the IPCA algorithm in image compression and in face detection and recognition, reaching a rigorous decision about which algorithm performs better in each area. We also propose a new method, based on three measures, for evaluating the performance of the PCA and IPCA algorithms in image compression. Finally, we propose using another segmentation method with the algorithms so that only the pixels occupied by faces are centered and normalized, which improves performance. MATLAB was used to perform the experiments. We found that the PCA algorithm generally performs better than the IPCA algorithm in most areas. It is better than the IPCA algorithm in face detection and recognition. The PCA algorithm is slightly slower than the IPCA algorithm, but it has significantly smaller error rates. It is also computationally simpler than the IPCA algorithm. 
On the other hand, the IPCA algorithm is better than the PCA algorithm in image compression because it achieves higher compression, more accurate reconstruction, and faster processing with acceptable errors.
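
To make the image-compression comparison above concrete, here is a minimal sketch of plain PCA compression and reconstruction in NumPy. It is not the paper's MATLAB code and does not implement IPCA. The synthetic data, the function names `pca_compress` and `pca_reconstruct`, and the two measures used (mean squared error and a simple storage ratio) are assumptions chosen for illustration, since the abstract does not name its three measures.

```python
import numpy as np

def pca_compress(X, k):
    """Project the rows of X onto the top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                  # shape (k, n_features)
    codes = Xc @ components.T            # shape (n_samples, k)
    return codes, components, mean

def pca_reconstruct(codes, components, mean):
    """Map the k-dimensional codes back to the original feature space."""
    return codes @ components + mean

# Synthetic stand-in for vectorized image patches (hypothetical data):
# 200 samples, 64 features, generated from 8 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 64))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

codes, comps, mean = pca_compress(X, k=8)
X_hat = pca_reconstruct(codes, comps, mean)

# Two illustrative measures: reconstruction error and storage ratio.
mse = np.mean((X - X_hat) ** 2)
ratio = X.size / (codes.size + comps.size + mean.size)
print(f"MSE = {mse:.6f}, storage ratio = {ratio:.2f}")
```

Lowering `k` raises the storage ratio at the cost of a larger reconstruction error, which is the trade-off any comparison of compression methods has to quantify.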

  • Book Chapter
  • Citations: 9
  • 10.5772/9353
Bi-2DPCA: A Fast Face Coding Method for Recognition
  • Feb 1, 2010
  • Jian Yang + 2 more

Face recognition has received significant attention in the past decades due to its potential applications in biometrics, information security, law enforcement, etc. Numerous methods have been suggested to address this problem [1]. Among appearance-based holistic approaches, principal component analysis (PCA) turns out to be very effective. As a classical unsupervised learning and data analysis technique, PCA was first used to represent images of human faces by Sirovich and Kirby in 1987 [2, 3]. Subsequently, Turk and Pentland [4, 5] applied PCA to face recognition and presented the well-known Eigenfaces method in 1991. Since then, PCA has been widely investigated and has become one of the most successful approaches to face recognition [6-15]. PCA-based image representation and analysis operates on image vectors. That is, before applying PCA, the given 2D image matrices must be mapped into 1D image vectors by stacking their columns (or rows). The resulting image vectors generally lead to a high-dimensional image vector space. In such a space, calculating the eigenvectors of the covariance matrix is a critical problem deserving consideration. When the number of training samples is smaller than the dimension of the images, the singular value decomposition (SVD) technique is useful for reducing the computational complexity [1-4]. However, when the training sample size becomes large, the SVD technique no longer helps. To deal with this problem, an incremental principal component analysis (IPCA) technique has been proposed recently [16], but the efficiency of this algorithm still depends on the distribution of the data. Over the last few years, two PCA-related methods, independent component analysis (ICA) [17] and kernel principal component analysis (KPCA) [18, 19], have been of wide concern. 
Bartlett [20], Yuen [21], Liu [22], and Draper [23] proposed using ICA for face representation and found that it was better than PCA when cosine was used as the similarity measure (however, the performance difference between ICA and PCA was not significant when the Euclidean distance was used [23]). Yang [24] and Liu [25] used KPCA for face feature extraction and recognition and showed that KPCA outperforms classical PCA. Like PCA, ICA and KPCA both follow the matrix-to-vector mapping strategy when they are used for image analysis, and their algorithms are more complex than PCA. So, ICA and KPCA are considered to be computationally more expensive than PCA. The experimental results in 16
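
The small-sample case described above (fewer training images than pixels) can be illustrated with a short sketch. It assumes the common Gram-matrix formulation, in which the eigenvectors of the small n × n matrix are mapped back to the d-dimensional image space; the chapter's own SVD procedure may differ in detail. The function name `pca_small_sample` and the random data are hypothetical.

```python
import numpy as np

def pca_small_sample(X, k):
    """PCA for n_samples < n_features via the n x n Gram matrix.

    Avoids forming the d x d covariance matrix. Requires k <= n_samples - 1,
    because centered data has rank at most n_samples - 1.
    """
    n = X.shape[0]
    mean = X.mean(axis=0)
    Xc = X - mean
    gram = Xc @ Xc.T / (n - 1)                 # n x n instead of d x d
    evals, evecs = np.linalg.eigh(gram)        # ascending order
    order = np.argsort(evals)[::-1][:k]
    evals, evecs = evals[order], evecs[:, order]
    # If gram @ v = lam * v, then Xc.T @ v is an eigenvector of the
    # covariance matrix with the same eigenvalue and norm sqrt((n-1)*lam).
    components = (Xc.T @ evecs) / np.sqrt((n - 1) * evals)
    return components.T, evals, mean

# Hypothetical data: 20 "images" of 500 pixels each.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))
components, evals, mean = pca_small_sample(X, k=5)
print(components.shape)
```

Here the eigenproblem is 20 × 20 rather than 500 × 500. As the text notes, this advantage disappears once the number of training samples grows large.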

  • Dissertation
  • Citations: 2
  • 10.37099/mtu.dc.etds/743
Comparison of Computer-Based and Optical Face Recognition Paradigms
  • Jan 1, 2014
  • Abdulaziz A Alorf

The main objectives of this thesis are to validate an improved principal components analysis (IPCA) algorithm on images; to design and simulate a digital model for image compression, face recognition, and image detection using a principal components analysis (PCA) algorithm and the IPCA algorithm; to design and simulate an optical model for face recognition and object detection using the joint transform correlator (JTC); to establish detection and recognition thresholds for each model; to compare the performance of the PCA algorithm with that of the IPCA algorithm in compression, recognition, and detection; and to compare the performance of the digital model with that of the optical model in recognition and detection. MATLAB was used to simulate the models. PCA is a technique for identifying patterns in data and representing the data so as to highlight similarities and differences. Identifying patterns in high-dimensional data (more than three dimensions) is difficult because the data cannot be represented graphically, which makes PCA a powerful method for analyzing such data. IPCA is another statistical tool for identifying patterns in data. It uses information theory to improve PCA. The joint transform correlator (JTC) is an optical correlator used for synthesizing a frequency plane filter for coherent optical systems. The IPCA algorithm generally performs better than the PCA algorithm in most of the applications. It is better than the PCA algorithm in image compression because it achieves higher compression, more accurate reconstruction, and faster processing with acceptable errors; in addition, it is better than the PCA algorithm in real-time image detection because it achieves the smallest error rate as well as remarkable speed. 
On the other hand, the PCA algorithm performs better than the IPCA algorithm in face recognition because it offers an acceptable error rate, easy calculation, and reasonable speed. Finally, in detection and recognition, the digital model performs better than the optical model.

  • Research Article
  • 10.32493/jtsi.v7i2.38935
Penerapan Principal Component Analysis pada Model Deteksi Dini Anak Autisme [Application of Principal Component Analysis to an Early Detection Model for Children with Autism]
  • Apr 30, 2024
  • Jurnal Teknologi Sistem Informasi dan Aplikasi
  • Aries Saifudin + 2 more

ASD (Autism Spectrum Disorder) is a neurological disorder that causes lifelong disturbances in children, resulting in mental illness. Treatment can help, but the disorder cannot be cured. Currently, ASD is detected by observing a child's behavior and intellectual activity. This diagnosis can be subjective, time-consuming, and inconclusive; it does not provide precise insight into genetics and is unsuitable for early detection. As with many healthcare conditions, a major challenge in autism is the timing of diagnosis. It can take up to 6 months to diagnose a child with autism with certainty because the process is lengthy and the child must see many different specialists, such as a developmental pediatrician, neurologist, psychiatrist, or psychologist. Machine learning methods can help speed up the process. This study proposes applying PCA (Principal Component Analysis). PCA is a foundation of multivariate data analysis based on projection. It is typically used to summarize large multivariate data tables as a smaller set of variables or summary indices, which are then analyzed to reveal trends, clusters of variables, and outliers. The study implements four variants: standard PCA, Kernel PCA, Sparse PCA, and Incremental PCA. It follows an experimental method: an application is built to implement the proposed algorithms, and the model is then tested on a secondary dataset and its performance measured. The results show that the model that applies Sparse PCA gives the best results, which means that PCA can be used to reduce the number of features and increase model performance.

  • Research Article
  • Citations: 5
  • 10.47893/ijess.2012.1053
An Improved Face Recognition Using Neighborhood Defined Modular Phase Congruency Based Kernel PCA
  • Apr 1, 2012
  • International Journal of Electronics Signals and Systems
  • M.Lokeswara Reddy + 1 more

A face recognition algorithm based on the NMPKPCA algorithm is presented in this paper. Compared with conventional principal component analysis (PCA) algorithms, the proposed algorithm has an improved recognition rate for face images with large variations in illumination and facial expression. In this technique, phase congruency features are first extracted from the face image, so that effects due to illumination variations are avoided by considering the phase component of the image. Then, face images are divided into small sub-images and the kernel PCA approach is applied to each of these sub-images. However, dividing images into modules that are too small or too large creates problems in recognition. A special modularization, called the neighborhood defined modularization approach, is therefore presented in this paper so that effects due to facial variations are avoided. Kernel PCA is then applied to each module to extract features. The result is a feature extraction technique that improves the recognition accuracy of a visual image based facial recognition system.

  • Conference Article
  • Citations: 79
  • 10.1109/icip.2003.1246944
An integrated algorithm of incremental and robust PCA
  • Nov 24, 2003
  • Y Li + 3 more

Principal component analysis (PCA) is a well-established technique in image processing and pattern recognition. Incremental PCA and robust PCA are two interesting problems with numerous potential applications. However, these two issues have only been addressed separately in previous studies. In this paper, we present a novel algorithm for incremental and robust PCA that seamlessly integrates the two. The proposed algorithm has the advantages of both incremental PCA and robust PCA. Moreover, unlike most M-estimation based robust algorithms, it is computationally efficient. Experimental results on dynamic background modelling are provided to show the performance of the algorithm in comparison with the conventional batch-mode and non-robust algorithms.

  • Research Article
  • Citations: 1
  • 10.3724/sp.j.1087.2012.02316
Face recognition algorithm based on multi-level texture spectrum features and PCA
  • May 7, 2013
  • Journal of Computer Applications
  • Xin-Peng Dang + 1 more

To improve the recognition rate of the Principal Component Analysis (PCA) algorithm in face recognition, a new algorithm combining the image texture spectrum feature with PCA was proposed. Firstly, the texture unit operator was used to extract the texture spectrum feature of the face image. Secondly, the PCA approach was used to reduce the dimensions of the texture spectrum feature. Finally, K-Nearest Neighbor (KNN) classification was chosen to recognize the face. The ORL and Yale face databases were used to test the proposed algorithm, and the recognition accuracies were 96.5% and 95% respectively, which were higher than those of PCA and Modular Two-Dimensional PCA (M2DPCA). The experimental results demonstrate the efficiency and accuracy of the proposed algorithm.

  • Book Chapter
  • Citations: 6
  • 10.5772/9367
Non-Linear Feature Extraction by Linear Principal Component Analysis Using Local Kernel
  • Feb 1, 2010
  • Kazuhiro Hotta

In the last decade, the effectiveness of kernel-based methods for object detection and recognition has been reported (Fukui et al., 2006; Hotta, 2008c; Kim et al., 2002; Pontil & Verri, 1998; Shawe-Taylor & Cristianini, 2004; Yang, 2002). In particular, Kernel Principal Component Analysis (KPCA) has taken the place of traditional linear PCA as the first feature extraction step in various studies and applications. KPCA can cope well with non-linear variations. However, KPCA must solve an eigenvalue problem of size (number of samples) × (number of samples). In addition, kernel functions must be computed against all training samples to map a test sample to the subspace obtained by KPCA. Therefore, the computational cost is the main drawback. To reduce the computational cost of KPCA, sparse KPCA (Tipping, 2001) and the use of clustering (Ichino et al., 2007, in Japanese) were proposed. Ichino et al. (2007, in Japanese) reported that KPCA of cluster centers is more effective than sparse KPCA. However, the computational cost becomes a big problem again when the number of classes is large and each class has one subspace. For example, KPCA of visual words (cluster centers of local features) (Hotta, 2008b) was effective for object categorization, but the computational cost is high. In this method, each of the 101 categories has one subspace constructed from 400 visual words. Namely, 40,400 (= 101 categories × 400 visual words) kernel computations are required to map a local feature to all subspaces. On the other hand, traditional linear PCA is independent of the number of samples when the dimension of a feature is smaller than the number of samples. This is because the size of the eigenvalue problem depends on the smaller of the feature dimension and the number of samples. To map a test sample to a subspace, only inner products between the basis vectors and the test sample are required. 
Therefore, in general, the computational cost of linear PCA is much lower than that of KPCA. In this paper, we propose a way to combine the non-linearity of KPCA with the computational cost of linear PCA (Hotta, 2008a). Kernel-based methods map training samples to a high-dimensional space as x → φ(x). Non-linearity is realized by a linear method in the high-dimensional space. The dimension of the mapped feature space of the Radial Basis Function (RBF) kernel is infinite, and we cannot describe the mapped feature explicitly. However, the mapped feature φ(x) of the polynomial kernel can be described explicitly. This means that KPCA with the polynomial kernel can be solved directly by linear PCA of the mapped features. Unfortunately, in general, the dimension of the mapped features is too high to solve by linear PCA even if the 2nd-degree polynomial kernel K(x, y) = (1 + x^T y)^2 is used. The dimension of mapped features of the polynomial 5
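
The claim that the polynomial kernel has an explicit feature map can be checked numerically. The sketch below writes out one standard choice of φ for the 2nd-degree kernel K(x, y) = (1 + x^T y)^2 and confirms that φ(x)·φ(y) reproduces the kernel value. The function name `poly2_features` and the test vectors are illustrative assumptions, not taken from the chapter.

```python
import numpy as np
from itertools import combinations

def poly2_features(x):
    """Explicit feature map for the kernel K(x, y) = (1 + x^T y)^2.

    Layout: [1, sqrt(2)*x_i, x_i^2, sqrt(2)*x_i*x_j for i < j].
    """
    x = np.asarray(x, dtype=float)
    cross = [np.sqrt(2.0) * x[i] * x[j]
             for i, j in combinations(range(len(x)), 2)]
    return np.concatenate(([1.0], np.sqrt(2.0) * x, x ** 2, cross))

rng = np.random.default_rng(0)
x = rng.normal(size=4)
y = rng.normal(size=4)

k_direct = (1.0 + x @ y) ** 2                      # kernel evaluated directly
k_mapped = poly2_features(x) @ poly2_features(y)   # inner product of mapped features
print(np.isclose(k_direct, k_mapped), poly2_features(x).size)
```

For an input of dimension d, this map has (d + 1)(d + 2)/2 components, so the mapped dimension grows quadratically with d. That growth is why running linear PCA directly on the mapped features quickly becomes impractical for high-dimensional inputs.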

  • Research Article
  • Citations: 13
  • 10.5075/epfl-thesis-2189
Mixture Models for Unsupervised and Supervised Learning
  • Jan 1, 2000
  • Infoscience (Ecole Polytechnique Fédérale de Lausanne)
  • Perry D Moerland

In a society which produces and consumes an ever increasing amount of information, methods which can make sense of all this data become crucially important. Machine learning tries to develop models which can make the information load accessible. Three important questions one can ask when constructing such models are: (1) What is the structure of the data? This is especially relevant for high-dimensional data which can no longer be visualized. (2) Which features are most characteristic? (3) How can one predict whether a pattern belongs to one class or to another? This thesis investigates these three questions by trying to construct complex models from simple ones. The decomposition into simpler parts can also be found in the methods used for estimating the parameter values of these models. The algorithms for the simple models constitute the core of the algorithms for the complex ones. The above questions are addressed in three stages: Unsupervised learning: This part deals with the problem of probability density estimation with the goal of finding a good probabilistic representation of the data. One of the most popular density estimation methods is the Gaussian mixture model (GMM). A promising alternative to GMMs is the recently proposed family of mixtures of latent variable models. Examples of the latter are principal component analysis (PCA) and factor analysis. The advantage of these models is that they are capable of representing the covariance structure with fewer parameters by choosing the dimension of a subspace in a suitable way. An empirical evaluation on a large number of data sets shows that mixtures of latent variable models almost always outperform GMMs. To avoid having to choose a value for the dimension of the subspace by a computationally expensive search technique such as cross-validation, a Bayesian treatment of mixtures of latent variable models is proposed. 
This framework makes it possible to determine the appropriate dimension during training and experiments illustrate its viability. Feature extraction: PCA is also (and foremost) a classic method for feature extraction. However, PCA is limited to linear feature extraction by a projection onto a subspace. Kernel PCA is a recent method which allows non-linear feature extraction. Applying kernel PCA to a data set with N patterns requires finding the eigenvectors of an N*N matrix. An Expectation-Maximization (EM) algorithm for PCA which does not need to store this matrix is adapted to kernel PCA and applied to large data sets with more than 10,000 examples. The experiments confirm that this approach is feasible and that the extracted features lead to good performance when used as pre-processed data for a linear classifier. A new on-line variant of the EM algorithm for PCA is presented and shown to speed up the learning process. Supervised learning: This part illustrates two ways of constructing complex models from simple ones for classification problems. The first approach is inspired by unsupervised mixture models and extends them to supervised learning. The resulting model, called a mixture of experts, tries to decompose a complex problem into subproblems treated by several simpler models. The division of the data space is effectuated by an input-dependent gating network. After a review of the model and existing training methods, different possible gating networks are proposed and compared. Unsupervised mixture models are one of the evaluated options. The experiments show that a standard mixture of experts with a neural network gate gives the best results. The second approach is a constructive algorithm called boosting which creates a committee of simple models by emphasizing patterns which have been frequently misclassified by the preceding classifiers. A new model has been developed which lies between a mixture of experts and a boosted committee. 
It adds an input-dependent combiner (like a gating network) to standard boosting. This has the advantage that with a considerably smaller committee results are obtained which are comparable to those of boosting. Finally, some of the investigated models have been evaluated on two problems of machine vision. The results confirm the potential of mixtures of latent variable models which lead to good performance when incorporated in a Bayes classifier.
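
The EM approach to PCA mentioned in the feature-extraction part can be sketched as follows. This is one well-known EM formulation for plain (linear) PCA, shown under the assumption that it resembles the thesis's starting point. The kernel adaptation and the on-line variant are not reproduced, and the function name `em_pca` and the synthetic data are hypothetical. Note that the loop never forms the full covariance matrix.

```python
import numpy as np

def em_pca(Y, k, n_iter=100, seed=0):
    """EM iteration for the k-dimensional principal subspace.

    Y has shape (n_samples, n_features). Returns an orthonormal basis of
    shape (k, n_features) spanning the estimated principal subspace.
    """
    rng = np.random.default_rng(seed)
    Yc = (Y - Y.mean(axis=0)).T                # centered data, shape (d, n)
    C = rng.normal(size=(Yc.shape[0], k))      # random initial basis, (d, k)
    for _ in range(n_iter):
        # E-step: latent coordinates given the current basis, shape (k, n).
        X = np.linalg.solve(C.T @ C, C.T @ Yc)
        # M-step: basis that best reconstructs the data from X, shape (d, k).
        C = Yc @ X.T @ np.linalg.inv(X @ X.T)
    Q, _ = np.linalg.qr(C)                     # orthonormalize the result
    return Q.T

# Hypothetical data: 3 strong latent directions in 10 dimensions plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 3)) * np.array([5.0, 3.0, 2.0])
Y = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(500, 10))

basis = em_pca(Y, k=3)
print(basis.shape)
```

Only k × k systems are solved in each iteration, which is what makes this style of algorithm attractive when the full matrix (d × d here, N × N in the kernel case) is too large to store.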

  • Conference Article
  • Citations: 29
  • 10.1109/icact.2008.4494037
Efficiency Improvement for Unconstrained Face Recognition by Weightening Probability Values of Modular PCA and Wavelet PCA
  • Feb 1, 2008
  • Wayo Puyati + 1 more

Principal component analysis (PCA) is a well-known classical appearance-based method in face recognition. In previous work, preprocessing significantly improved the recognition rate. Modular PCA and Wavelet PCA are preprocessing methods for PCA that increase the recognition rate of the original PCA. Modular PCA is suitable for face databases with high variation, while Wavelet PCA is suitable for face databases with low variation. In this paper, we propose a preprocessing method that combines Modular PCA and Wavelet PCA by weighting their probability values. Experiments compare our proposed method, Modular PCA, Wavelet PCA, and the original PCA on the Yale, ORL, and UMIST face databases. The experimental results show that the recognition rate of our method is higher than that of the other methods and that it supports a variety of face databases.

  • Conference Article
  • Citations: 2
  • 10.1145/3378065.3378079
Research Based on Improving PCA Face Recognition
  • Nov 16, 2019
  • Xu Henghui + 2 more

The Principal Component Analysis (PCA) algorithm is widely used in the field of face recognition because of its high recognition rate and simplicity. The PCA algorithm is based on the principle of the Karhunen-Loeve Transformation. Because the PCA algorithm is sensitive to outliers, it is improved by combining it with the Linear Discriminant Analysis (LDA) algorithm, and the PCA-LDA face recognition method is proposed. This method obtains the feature space of the training sample set with the PCA algorithm. On this basis, the LDA algorithm is executed to obtain its own feature space. The PCA feature space is then fused with the LDA feature space to obtain a new feature space that combines the two. Finally, the faces projected into the new feature space are trained and recognized. The experimental results show that the face recognition algorithm proposed in this paper has a higher recognition rate than the traditional PCA algorithm.

  • Conference Article
  • Citations: 4
  • 10.1109/cisp-bmei.2016.7852755
Recognition of partially occluded face by error detection with logarithmic operator and KPCA
  • Oct 1, 2016
  • Xiaolin Chen + 2 more

Occluded images often reduce recognition rates in face recognition, so the occlusion should be detected and given a small weighting coefficient to weaken its impact on the recognition rate as much as possible. Traditional algorithms often use a reconstruction error operator based on principal component analysis (PCA) to estimate the weight for an occluded face, which often requires iterative computation and consequently has high time complexity. Moreover, noise and threshold selection cause many difficulties during iteration. To overcome these shortcomings, this paper proposes a new face recognition method for partial occlusion based on error detection and the kernel principal component analysis (KPCA) algorithm. First, face images are divided into several regions. Then, the weight of each region is calculated by a new logarithmic transformed Gaussian error operator instead of the traditional error operator. Finally, each region's features are extracted by KPCA, which is a nonlinear transformation and mapping method. All features are fused with their weights to perform the final classification and recognition. Experimental results on the AR facial database indicate that the new algorithm is robust and valid.

  • Conference Article
  • 10.3390/engproc2024068062
Decomposing the Sri Lanka Yield Curve Using Principal Component Analysis to Examine the Term Structure of the Interest Rate
  • Aug 27, 2024
  • K P N Sanjeewa Dayarathne + 1 more

In this study, we examine the dynamics of the Sri Lankan government bond market, building upon prior research on the application of principal component analysis (PCA) to modelling sovereign yield curves. Our analysis covers data from January 2010 to August 2022. The study applied several PCA variants, namely multivariate PCA, Randomized PCA, Incremental PCA, Sparse PCA, Functional PCA, and Kernel PCA, to smoothed data. Kernel PCA was found to explain the majority of the variation in the data. Findings reveal that the first principal component accounted for a substantial 97.69% of the variation in yield curve movements, the second for 1.88%, and the third for 0.42%. These results align with previous research, which generally posits that the first three principal components explain around 95% of the fluctuations within the term structure of yields. Our results question the empirical findings which state that the first principal component represents the longer tenor of the yield curve. In Sri Lanka, the first principal component instead represents the 3-year bond yields. This may be because of liquidity constraints in underdeveloped frontier markets, where longer tenor yields do not react fast enough to reflect the movement of the yield curve. The second principal component represents the slope of the yield curve, which is the yield difference between the 10-year T-Bond and the 3-month T-Bill. The third principal component represents the curvature of the yield curve, attributed to 2 × 3-year T-Bond yield − 3-month T-Bill yield − 10-year T-Bond yield.
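
The percentages quoted above are explained-variance ratios. The following sketch shows how such ratios are computed. It uses a purely synthetic yield panel with a dominant "level" factor and a weaker "slope" factor; the tenor grid, factor scales, and the function name `explained_variance_ratio` are all assumptions for illustration and have no connection to the Sri Lankan data.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2
    return var / var.sum()

# Synthetic panel: 500 observations of yields at six hypothetical tenors (years).
rng = np.random.default_rng(0)
tenors = np.array([0.25, 1.0, 2.0, 3.0, 5.0, 10.0])
n_obs = 500

# A parallel-shift ("level") factor moves all tenors together.
level = rng.normal(scale=1.0, size=(n_obs, 1)) * np.ones_like(tenors)
# A weaker "slope" factor tilts the curve around its mean tenor.
tilt = (tenors - tenors.mean()) / np.ptp(tenors)
slope = rng.normal(scale=0.2, size=(n_obs, 1)) * tilt
noise = 0.01 * rng.normal(size=(n_obs, tenors.size))

yields = level + slope + noise
ratios = explained_variance_ratio(yields)
print(np.round(ratios[:3], 4))
```

When one factor moves all tenors together, the first component absorbs most of the variance, which is the pattern the study reports for the first principal component.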

  • Research Article
  • Citations: 2
  • 10.1021/jasms.4c00314
Processing Next-Generation Mass Spectrometry Imaging Data: Principal Component Analysis at Scale.
  • Oct 28, 2024
  • Journal of the American Society for Mass Spectrometry
  • Kasper Krijnen + 3 more

Mass spectrometry imaging (MSI) is constantly improving in spatial resolving power, throughput, and mass resolution. Although beneficial, these improvements increase data set size and content. Larger data sets require correspondingly fast computer-based analyses. However, these analyses often do not scale well with increased data size. Principal component analysis (PCA) is an important analytical tool commonly used with MSI data; however, most PCA algorithms load and process the entire data set within random access memory (RAM), which is most often insufficient for large data sets. PCA algorithms that use less RAM than the data set exist but are usually much slower or sacrifice precision and are rarely used for MSI data processing. Incremental PCA (IPCA) is an alternative algorithm that avoids large RAM allocations while also preserving speed and analytical precision. Here, we demonstrate and benchmark the use of differing implementations of IPCA, PCA, and commercial software on large and often complex MSI data sets. We show that, using an already-published Python-based IPCA algorithm, IPCA can be successfully applied to MSI data sets too large to fit within RAM. Furthermore, our benchmarks demonstrate that, contrary to expectations, IPCA is faster than all other tested PCA implementations on all large data sets that can be directly compared.
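
The idea of running PCA without holding the whole data set in RAM can be illustrated with a simple chunked computation. The sketch below is not the IPCA algorithm benchmarked in the paper. It is a simpler stand-in that accumulates the mean and second moments one chunk at a time and then eigendecomposes the resulting covariance matrix. It is exact, but it needs memory for a d × d matrix, so it only suits a moderate number of features. The function name `chunked_pca` and the data are hypothetical.

```python
import numpy as np

def chunked_pca(chunks, k):
    """PCA from an iterable of row chunks, without stacking all rows.

    Each chunk has shape (n_rows, d). Only O(d^2) state is kept.
    """
    n = 0
    total = None
    outer = None
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=float)
        if total is None:
            d = chunk.shape[1]
            total = np.zeros(d)
            outer = np.zeros((d, d))
        n += chunk.shape[0]
        total += chunk.sum(axis=0)        # running sum of rows
        outer += chunk.T @ chunk          # running sum of outer products
    mean = total / n
    cov = (outer - n * np.outer(mean, mean)) / (n - 1)
    evals, evecs = np.linalg.eigh(cov)    # ascending order
    order = np.argsort(evals)[::-1][:k]
    return evecs[:, order].T, evals[order], mean

# Hypothetical data streamed in chunks of 100 rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6)) * np.array([3.0, 2.0, 1.5, 1.0, 0.5, 0.2])
stream = (X[i:i + 100] for i in range(0, 1000, 100))

components, evals, mean = chunked_pca(stream, k=3)
print(components.shape)
```

In a real out-of-core setting the generator would read each chunk from disk rather than slice an in-memory array.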

  • Research Article
  • Citations: 10
  • 10.1016/j.patrec.2017.11.023
Sparse kernel feature extraction via support vector learning
  • Nov 28, 2017
  • Pattern Recognition Letters
  • Kunzhe Wang + 1 more

