Abstract

Data are now generated at high speed, so identifying informative features and reducing the dimensionality of data without losing useful information is very important. Among the many approaches to dimensionality reduction is principal component analysis (PCA), which reduces the dimension of the data by identifying the effective dimensions at an acceptable level. In the usual application of PCA, the data are either normal or are normalized before PCA is applied. Many studies have examined PCA as a data preparation step. In this paper, we propose a method that improves PCA and makes data analysis easier and more efficient. We first identify the relationships among the variables by fitting a multivariate copula function to the data and simulate new data using the estimated parameters; we then reduce the dimensions of the new data with PCA. The aim is to improve the ability of PCA to find the effective dimensions.

Highlights

  • In many real-world applications, reducing high-volume data is an important and necessary preliminary stage of data processing.

  • We first use the copula function to study the correlations and relationships among the variables, so that irrelevant features can be identified and eliminated, and simulate new data using the estimated parameters; we then reduce the dimensions of the data with principal component analysis (PCA) [4,5,6].

  • The correlation values of the variables are obtained with the Gaussian copula function. Table 1 shows that the correlation of variable X2 is lower than that of the other variables, so X2 is eliminated at the first stage.


Summary

Introduction

In many real-world applications, reducing high-volume data is an important and necessary preliminary stage of data processing. In data mining, dimensionality reduction is considered one of the most important stages for removing data redundancy, increasing the precision of measurement, and improving the decision-making process. A widely used method for reducing the dimension of data in the data preparation phase of data mining is principal component analysis (PCA). We first use the copula function to study the correlations and relationships among the variables, so that irrelevant features can be identified and eliminated, and simulate new data using the estimated parameters; we then reduce the dimensions of the data with PCA [4,5,6].
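The pipeline described above (fit a Gaussian copula, simulate new data, then apply PCA) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the copula correlation is estimated from rank-based normal scores, the margins are taken to be the empirical quantiles of the original columns, and only the leading principal component is computed, by power iteration. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def normal_scores(column):
    """Map a column to normal scores via its ranks (pseudo-observations)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def correlation(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_gaussian_copula(columns):
    """Assumed estimator: correlation matrix of the normal scores."""
    scores = [normal_scores(c) for c in columns]
    return [[correlation(a, b) for b in scores] for a in scores]

def cholesky(m):
    """Lower-triangular factor of a positive-definite matrix."""
    d = len(m)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low

def simulate(columns, corr, n_samples, rng):
    """Draw rows from the fitted copula; margins are empirical quantiles."""
    low = cholesky(corr)
    sorted_cols = [sorted(c) for c in columns]
    d = len(columns)
    rows = []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        row = []
        for zi, col in zip(z, sorted_cols):
            u = STD_NORMAL.cdf(zi)  # back to the uniform scale
            row.append(col[min(int(u * len(col)), len(col) - 1)])
        rows.append(row)
    return rows

def leading_component(rows, iterations=200):
    """Leading principal direction and its share of the total variance."""
    d = len(rows[0])
    means = [mean(r[j] for r in rows) for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (len(rows) - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):  # power iteration on the covariance matrix
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    return v, eigenvalue / sum(cov[j][j] for j in range(d))
```

In this sketch, `fit_gaussian_copula` takes a list of columns and returns the estimated correlation matrix, whose entries could be inspected to screen out weakly related variables before calling `simulate` and `leading_component`. A full analysis would compute all principal components and would likely rely on a numerical library rather than hand-written linear algebra.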

