Abstract

Different versions of principal component analysis (PCA) have been widely used to extract important information for image recognition and image clustering problems. However, owing to the presence of outliers, this remains challenging. This paper proposes a new PCA methodology based on a novel discovery that the widely used l1-PCA is equivalent to a two-groups k-means clustering model. The projection vector of the l1-PCA is the vector difference between the two cluster centers estimated by the clustering model. In theory, this vector difference provides inter-cluster information, which is beneficial for distinguishing data objects from different classes. However, the performance of l1-PCA is not comparable with the state-of-the-art methods. This is because the l1-PCA can be sensitive to outliers, as the equivalent clustering model is not robust to outliers. To overcome this limitation, we introduce a trimming function to the clustering model and propose a trimmed-clustering based l1-PCA (TC-PCA). With this trimming set formulation, the TC-PCA is not sensitive to outliers. Besides, we mathematically prove the convergence of the proposed algorithm. Experimental results on image classification and clustering indicate that our proposed method outperforms the current state-of-the-art methods.

Highlights

  • Image classification and clustering problems are topics fundamental to various areas of machine learning [1,2,3] including image recognition and image clustering

  • We show that l1-principal component analysis (l1-PCA) is equivalent to the two-groups k-means clustering model

  • We show that the projection vectors estimated by the proposed trimmed-clustering based l1-PCA (TC-PCA) can provide inter-cluster information that is beneficial to classification and clustering problems

Summary

Introduction

Image classification and clustering are fundamental topics in various areas of machine learning [1,2,3], including image recognition and image clustering. A good dimension reduction method can effectively extract the key facial features and, at the same time, ignore the occluded information. Another example is the image clustering problem. As in the face recognition problem above, corrupted and noisy information such as scarves can make two facial images look very different even though they are taken from the same person. This makes the clustering task very challenging. The projection vector of the l1-PCA represents the inter-cluster direction, which is beneficial for distinguishing data objects from different classes.
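To make the link between the l1-PCA projection vector and a two-group split concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard l1-PCA objective (maximize the sum of |x_i^T w| over unit vectors w) and uses the well-known sign-based fixed-point iteration for it; the function names `l1_pca_direction` and `group_mean_difference` are hypothetical. For centered data, the signs of the projections split the samples into two groups, and the update direction is parallel to the difference of the two group means. The trimming step of TC-PCA is not reproduced here because its details are not given in this summary.

```python
import numpy as np

def l1_objective(X, w):
    """Sum of absolute projections of the rows of X onto w."""
    return np.abs(X @ w).sum()

def l1_pca_direction(X, w0, n_iter=100):
    """Sign-based fixed-point iteration for one l1-PCA direction.

    X  : (n_samples, n_features) array, assumed to be centered.
    w0 : initial direction (any nonzero vector).
    Returns a unit vector w; the objective never decreases across updates.
    """
    w = w0 / np.linalg.norm(w0)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0               # assign zero projections to the "+" group
        w_new = X.T @ s               # sum of "+" samples minus sum of "-" samples
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):     # fixed point reached
            break
        w = w_new
    return w

def group_mean_difference(X, w):
    """Difference between the means of the two sign groups induced by w."""
    proj = X @ w
    return X[proj >= 0].mean(axis=0) - X[proj < 0].mean(axis=0)

# Illustrative data (an assumption): anisotropic 2-D Gaussian samples, centered.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2)) * np.array([5.0, 0.5])
X -= X.mean(axis=0)

w0 = rng.standard_normal(2)
w = l1_pca_direction(X, w0)

# Sign-weighted sum and the group-mean difference point in the same direction.
s = np.where(X @ w >= 0, 1.0, -1.0)
v = X.T @ s
d = group_mean_difference(X, w)
```

In this sketch, `v` plays the role of the unnormalized projection vector and `d` the difference between the two group centers; for centered data the two are parallel, which is the sense in which the projection vector carries inter-cluster information.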

Related Work
A Trimmed-Clustering Based l1-PCA Model
Relating l1-PCA to Two-Groups K-Means Clustering
The Proposed Model
The Overall Implementation
Mathematical Properties
Synthetic Analysis
Experiments
ImageWe
Parameter Study
Effectiveness of the Trimming Parameter
Sensitivity Analysis of Cluster Centers Initialization of the SPVEA
Discussions for Large Number of Outliers
Findings
Conclusions