Hash-tree PCA: accelerating PCA with hash-based grouping

Lkhagvadorj Battulga, Kwan-Hee Yoo, Aziz Nasridinov, Sang-Hyun Lee

doi:10.1007/s11227-019-02947-x

Open Access

Journal: The Journal of Supercomputing
Publication Date: Jul 11, 2019
Citations: 3

Affiliation: Eindhoven University of Technology, Chungbuk National University

#Principal Component Analysis #Hash Table

Abstract

In data mining and machine learning, principal component analysis (PCA) is one of the most commonly used feature extraction techniques. However, it performs poorly on large datasets. In this paper, we propose hash-tree PCA, a new method for accelerating conventional PCA. It samples objects that are similar to each other without losing the original data distribution. First, it finds similar objects and stores them in hash tables. Next, it samples a certain number of objects from each hash table and creates a new dataset with fewer objects. Finally, it executes PCA on the sampled dataset. Experimental results show that our method outperforms the conventional PCA and fast PCA methods.
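The three steps in the abstract (group similar objects into hash tables, sample from each table, run PCA on the sample) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not say which hash function or tree structure is used, so the sketch substitutes sign-of-random-projection hashing, and the function name `hash_group_sample_pca` and the parameters `n_bits` and `per_bucket` are hypothetical.

```python
import numpy as np


def hash_group_sample_pca(X, n_bits=4, per_bucket=20, n_components=2, seed=0):
    """Group rows by a hash key, sample from each group, run PCA on the sample.

    Assumption: the hash is the sign pattern of a few random projections,
    a stand-in for whatever hash the paper actually uses.
    """
    rng = np.random.default_rng(seed)

    # Step 1: hash similar objects into the same bucket.
    # Nearby rows tend to fall on the same side of each random hyperplane.
    centered = X - X.mean(axis=0)
    planes = rng.standard_normal((X.shape[1], n_bits))
    bits = (centered @ planes > 0).astype(np.int64)
    keys = bits @ (1 << np.arange(n_bits))  # pack the bits into one integer key

    # Step 2: sample up to `per_bucket` rows from every bucket.
    sampled = []
    for key in np.unique(keys):
        members = np.flatnonzero(keys == key)
        take = min(per_bucket, members.size)
        sampled.append(rng.choice(members, size=take, replace=False))
    sampled_idx = np.concatenate(sampled)

    # Step 3: ordinary PCA (via SVD) on the reduced dataset.
    S = X[sampled_idx]
    S_centered = S - S.mean(axis=0)
    _, _, vt = np.linalg.svd(S_centered, full_matrices=False)
    components = vt[:n_components]  # rows are principal directions
    return components, sampled_idx


if __name__ == "__main__":
    data = np.random.default_rng(1).standard_normal((1000, 5))
    comps, idx = hash_group_sample_pca(data)
    print(comps.shape, idx.size)
```

The speed-up in such a scheme comes from running the decomposition on `sampled_idx.size` rows instead of all rows of `X`. Taking a fixed number of rows per bucket is the simplest reading of "a certain number of the objects from each hash table"; how the paper chooses that number is not stated in the abstract.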
