Dimensionality Reduction and Classification through PCA and LDA

Telgaonkar Archanah, Deshmukh Sachin

doi:10.5120/21790-5104

Abstract

An information explosion has occurred in most sciences and research fields over the last few decades, driven by advances in data collection and storage capacity. Advanced datasets with large numbers of observations present new challenges in data mining, analysis, and classification. Traditional statistical methods break down partly because of the increase in the number of variables associated with each observation, a situation known as high-dimensional data. Much of the data is highly redundant and can be ignored when extracting the features of a dataset. Dimensionality reduction is the process of mapping high-dimensional data to a lower-dimensional space so as to discard uninformative variance, or of finding a subspace in which the data can be easily detected. In this paper, two well-known dimensionality reduction techniques, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), are studied. Performance analysis is carried out on the high-dimensional datasets UMIST, COIL, and YALE, which consist of images of objects and human faces. The objects are classified using the k-nearest neighbor (KNN) and naive Bayes classifiers to compare the performance of these techniques. The difference between supervised and unsupervised learning is also inferred from these results.

Keywords

Dimensionality reduction, KNN, LDA, PCA, naive Bayes.
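
As a rough illustration of the pipeline the abstract describes (project high-dimensional data onto a low-dimensional subspace, then classify), the following is a minimal sketch and not the authors' implementation. It assumes NumPy only, uses synthetic two-class Gaussian data in place of the UMIST, COIL, and YALE images, and shows only the unsupervised PCA branch followed by a nearest-neighbor classifier; LDA and naive Bayes are omitted. The function names `pca_fit`, `pca_transform`, and `knn_predict` are hypothetical helpers introduced here.

```python
import numpy as np

def pca_fit(X, k):
    """Return the training mean and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project X onto the subspace spanned by the given components."""
    return (X - mean) @ components.T

def knn_predict(Z_train, y_train, Z_test, k=1):
    """Classify each test point by majority vote among its k nearest neighbors."""
    # Squared Euclidean distance between every test and training point.
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for image data (an assumption): 2 classes, 50 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 50)),
               rng.normal(3.0, 1.0, (60, 50))])
y = np.repeat([0, 1], 60)

# Random train/test split.
perm = rng.permutation(len(X))
train, test = perm[:90], perm[90:]

# Fit PCA on the training set only, then reduce 50 dimensions to 2.
mean, components = pca_fit(X[train], k=2)
Z_train = pca_transform(X[train], mean, components)
Z_test = pca_transform(X[test], mean, components)

pred = knn_predict(Z_train, y[train], Z_test, k=1)
accuracy = (pred == y[test]).mean()
print(f"reduced shape: {Z_train.shape}, 1-NN accuracy: {accuracy:.2f}")
```

PCA is fitted without looking at the labels, which is what makes it the unsupervised half of the comparison; a supervised method such as LDA would instead use `y[train]` when choosing the projection.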

Full Text