Abstract

An information explosion has occurred in most of the sciences and research fields over the last few decades, driven by advances in data collection and storage capacity. Advanced datasets with large numbers of observations present new challenges in data mining, analysis, and classification. Traditional statistical methods break down partly because of the increase in the number of variables associated with each observation; such data is known as high-dimensional data. Much of this data is highly redundant, and the redundancy can be ignored when extracting the features of a dataset. Dimensionality reduction is the process of mapping high-dimensional data to a lower-dimensional space in such a way that uninformative variance is discarded, or of finding a subspace in which the data can be easily detected. In this paper, two well-known dimensionality reduction techniques, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), are studied. Performance analysis is carried out on the high-dimensional datasets UMIST, COIL, and YALE, which consist of images of objects and human faces. The objects are classified using a k-nearest neighbor (KNN) classifier and a naive Bayes classifier to compare the performance of these techniques. The difference between supervised and unsupervised learning is also inferred from these results.

Keywords: Dimensionality reduction, KNN, LDA, PCA, naive Bayes.
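The contrast between unsupervised PCA and supervised LDA can be illustrated with a minimal sketch. This is not the paper's experiment: it assumes a synthetic two-class, two-dimensional dataset in place of UMIST, COIL, and YALE, and it uses a 1-nearest-neighbor classifier as a stand-in for KNN. All function names (`pca_direction`, `lda_direction`, `make_data`, and so on) are hypothetical helpers introduced for illustration. The data is built so that the direction of largest variance carries no class information, which is the situation where ignoring the labels is expected to hurt.

```python
import math
import random


def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, mu):
    """Scatter matrix: sum over rows of (x - mu)(x - mu)^T."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        c = [r[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += c[a] * c[b]
    return s


def pca_direction(rows, iters=200):
    """Top principal axis by power iteration on the total scatter (labels unused)."""
    s = scatter(rows, mean(rows))
    v = [1.0, 1.0]
    for _ in range(iters):
        w = [s[0][0] * v[0] + s[0][1] * v[1], s[1][0] * v[0] + s[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    return v


def lda_direction(rows, labels):
    """Two-class Fisher direction w = Sw^{-1} (m1 - m0), for 2-D inputs only."""
    c0 = [r for r, y in zip(rows, labels) if y == 0]
    c1 = [r for r, y in zip(rows, labels) if y == 1]
    m0, m1 = mean(c0), mean(c1)
    s0, s1 = scatter(c0, m0), scatter(c1, m1)
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Apply the closed-form inverse of the 2x2 within-class scatter matrix.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]


def project(rows, v):
    """Reduce each 2-D point to one coordinate along direction v."""
    return [r[0] * v[0] + r[1] * v[1] for r in rows]


def one_nn_accuracy(train_z, train_y, test_z, test_y):
    """Accuracy of a 1-nearest-neighbor classifier on 1-D projections."""
    correct = 0
    for z, y in zip(test_z, test_y):
        nearest = min(range(len(train_z)), key=lambda i: abs(train_z[i] - z))
        if train_y[nearest] == y:
            correct += 1
    return correct / len(test_z)


def make_data(n, rng):
    """Synthetic data (an assumption): wide spread on axis 0, classes split on axis 1."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(0.0, 5.0), rng.gauss(2.0 * y, 0.3)])
        labels.append(y)
    return rows, labels


rng = random.Random(0)
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

v_pca = pca_direction(train_x)
v_lda = lda_direction(train_x, train_y)

acc_pca = one_nn_accuracy(project(train_x, v_pca), train_y,
                          project(test_x, v_pca), test_y)
acc_lda = one_nn_accuracy(project(train_x, v_lda), train_y,
                          project(test_x, v_lda), test_y)

print("PCA direction:", v_pca, "1-NN accuracy:", acc_pca)
print("LDA direction:", v_lda, "1-NN accuracy:", acc_lda)
```

On data constructed this way, PCA keeps the high-variance axis regardless of the labels, while LDA uses the labels to keep the axis that separates the classes. Real image datasets are far higher-dimensional and need not behave like this toy case, so the sketch says nothing about the results reported in the paper.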

