Abstract

Local Principal Component Analysis (PCA) reduces the linearly redundant components that may be present in a higher-dimensional space. It deploys an initial-guess technique that can be used when the distribution of the given multivariate data is known to the user; the initialization problem arises when the distribution is not known. This study explores a technique that can be easily integrated into the local PCA design and is efficient even when the underlying statistical distribution is unknown. Initialization with the proposed splitting technique splits and reproduces not only the mean vector but also the orientation of the components in the subspace domain, ensuring that all clusters are used in the design. The proposed integration with the reconstruction-distance local PCA design enables easier data processing and a more accurate representation of multivariate data. A comparative study demonstrates the greater effectiveness of the proposed approach in terms of percentage error.
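The splitting idea can be illustrated with a minimal Python sketch, assuming an LBG-style perturbation of each cluster mean along its leading principal axis; the function name, the epsilon value, and the choice of perturbation direction are illustrative assumptions, not the paper's exact formulation:

    import numpy as np

    def split_cluster(mean, eigvecs, eps=0.01):
        # mean    : (d,) cluster mean vector
        # eigvecs : (d, m) local PCA basis, columns are principal directions
        # eps     : perturbation size (illustrative assumption)
        direction = eigvecs[:, 0]            # leading principal axis
        child_a = mean + eps * direction     # perturb the mean in +direction
        child_b = mean - eps * direction     # and in -direction
        # Both children inherit the parent's basis, so the split reproduces
        # the mean vector and the orientation of the subspace components.
        return (child_a, eigvecs.copy()), (child_b, eigvecs.copy())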

Highlights

  • Dimension reduction methods are employed in statistical pattern classification problems to represent higher-dimensional embeddings in a lower-dimensional space by eliminating the redundant components that may be present in multivariate data, so that the data loss is minimal

  • In the vector quantization principal component analysis (VQPCA)-sp approach, the set of feature vectors is first separated into disjoint regions, or clusters, by applying a vector quantization technique to each given class; this assignment step is sketched after this list. (There is a fundamental difference between a class and a cluster: a class represents a set of feature vectors or parameters of a distinct element and can accommodate several clusters, i.e., a cluster is a subset of a class.)

  • This study describes a new splitting technique for the local Principal Component Analysis (PCA) approach (VQPCA), utilizing a hybrid distance as the distance measure for cluster separation, as sketched below
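The two ideas above, cluster assignment by vector quantization and the hybrid distance, can be sketched together in Python. This is a minimal illustration assuming the hybrid distance is a weighted mix of the Euclidean distance to the cluster mean and the reconstruction distance to the cluster's local PCA subspace; the weight alpha and the function names are assumptions, not the paper's exact definition:

    import numpy as np

    def hybrid_distance(x, mean, eigvecs, alpha=0.5):
        # eigvecs : (d, m) local PCA basis, assumed orthonormal columns
        # alpha   : assumed weighting between the two distance terms
        diff = x - mean
        proj = eigvecs @ (eigvecs.T @ diff)      # projection onto the local subspace
        recon_err = np.sum((diff - proj) ** 2)   # reconstruction distance
        eucl = np.sum(diff ** 2)                 # Euclidean distance to the mean
        return alpha * eucl + (1.0 - alpha) * recon_err

    def assign_clusters(X, means, bases, alpha=0.5):
        # Assign each row of X to the cluster with the smallest hybrid distance.
        labels = np.empty(len(X), dtype=int)
        for i, x in enumerate(X):
            dists = [hybrid_distance(x, m, V, alpha)
                     for m, V in zip(means, bases)]
            labels[i] = int(np.argmin(dists))
        return labels

With alpha = 1 this reduces to ordinary vector quantization on the means, and with alpha = 0 to pure reconstruction-distance assignment.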

Introduction

Dimension reduction methods are employed in statistical pattern classification problems to represent higher-dimensional embeddings in a lower-dimensional space by eliminating the redundant components that may be present in multivariate data, so that the data loss is minimal. The interpretation of multivariate data, or feature vectors, becomes quite unmanageable when the dimension is high: it severely increases the memory and storage requirements and compounds the difficulty of pattern classification. The given data depend on several characteristics; for example, in face recognition, the classification of faces depends on the location of the eyes, the width and height of the nose and mouth, the length of the eyebrows, the complexion, and so on. Together, these characteristics constitute one vector of the given multivariate data
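As a concrete illustration of this reduction step, the following minimal Python sketch projects d-dimensional feature vectors onto the m leading principal components, so that the discarded directions carry as little variance as possible; the function name and interface are illustrative assumptions:

    import numpy as np

    def pca_reduce(X, m):
        # X : (n, d) matrix of n feature vectors; m : target dimension
        mean = X.mean(axis=0)
        centered = X - mean
        cov = np.cov(centered, rowvar=False)       # (d, d) sample covariance
        eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues
        order = np.argsort(eigvals)[::-1]          # sort by descending variance
        basis = eigvecs[:, order[:m]]              # (d, m) leading components
        return centered @ basis, mean, basis       # reduced data plus model

For instance, a matrix of 100-dimensional face-feature vectors could be reduced with pca_reduce(X, 10), keeping only the ten directions of greatest variance.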
