Principal Component Analysis-Based Data Clustering for Labeling of Level Damage Sector in Post-Natural Disasters

Agung Teguh Wibowo Almais, Imam Tazi, Mokhamad Amin Hariyadi, Mohammad Singgih Purwanto, Supriyono Supriyono, Yunifa Miftachul Arif, Agus Naba, Diyan Parwatiningtyas, Moechammad Sarosa, Hendro Wicaksono, Muhammad Aziz Muslim, Cahyo Crysdian, Adi Susilo, Puspa Miladin Nuraida Safitri Abdul Basid

doi:10.1109/access.2023.3275852

Abstract

Post-disaster sector damage data records, for each case, a set of features (criteria) that describe the level of damage to a sector after a natural disaster. The criteria are building condition, building structure, building physical state, building function, and other supporting conditions. This study uses 216 records, each with these 5 criteria. The 216 records were processed with Principal Component Analysis (PCA) to derive a label for each record. The labels are then used to cluster the data according to a value scale taken from the normalized data produced in the PCA process. In that process, the data is reduced to 2 components, PC1 and PC2, each with its own variance ratio and eigenvalue. PC1 has a variance ratio of 85.17% and an eigenvalue of 4.28, while PC2 has a variance ratio of 9.36% and an eigenvalue of 0.47. The PCA results are then plotted on a 2-dimensional graph for visualization. Based on this graph, a value scale yields 3 data clusters. In terms of the coordinate value (n), cluster 1 is n < 0, cluster 2 is 0 ≤ n < 2, and cluster 3 is n ≥ 2. To test these 3 clusters, two experiments compare them against the original target data: one uses the PC1 results and the other uses the PC2 results. There are 2 findings. First, PC1 groups the data much better than PC2, because its variance ratio and eigenvalue are greater than those of PC2.
Second, testing the PC1 data against the original target data produces a good grouping: original target data with a value of 1 (slightly damaged) falls in cluster 1 (n < 0), data with a value of 2 (moderately damaged) falls in cluster 2 (0 ≤ n < 2), and data with a value of 3 (heavily damaged) falls in cluster 3 (n ≥ 2). It can therefore be concluded that PCA, which many studies have used for feature reduction, can also be used, as in this study, to label unlabeled data so that it carries an appropriate label for further processing.
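The labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pca_labels`, the synthetic 216 × 5 data, and the choice to orient PC1 so that its loadings sum to a positive value are all assumptions made here for illustration. Only the thresholds (n < 0, 0 ≤ n < 2, n ≥ 2) and the use of two components come from the abstract.

```python
import numpy as np

def pca_labels(X, low=0.0, high=2.0):
    """Standardize X, run PCA, and label each row from its PC1 coordinate.

    Labels follow the value scale in the abstract:
    1 if n < low, 2 if low <= n < high, 3 if n >= high.
    """
    X = np.asarray(X, dtype=float)
    # Normalize each criterion to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Eigen-decomposition of the covariance matrix of the normalized data.
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: the sign of a principal axis is arbitrary, so PC1 is
    # oriented here so that higher criterion values give higher scores.
    if eigvecs[:, 0].sum() < 0:
        eigvecs[:, 0] *= -1
    variance_ratio = eigvals / eigvals.sum()
    scores = Z @ eigvecs[:, :2]                # columns are PC1 and PC2
    pc1 = scores[:, 0]
    labels = np.where(pc1 < low, 1, np.where(pc1 < high, 2, 3))
    return scores, eigvals[:2], variance_ratio[:2], labels

# Synthetic stand-in for the 216 records with 5 criteria (NOT the paper's data).
rng = np.random.default_rng(0)
severity = rng.integers(1, 4, size=216)        # hypothetical targets 1..3
X = severity[:, None] + rng.normal(0.0, 0.3, size=(216, 5))

scores, eigvals, ratio, labels = pca_labels(X)
print("eigenvalues (PC1, PC2):", eigvals)
print("variance ratios (PC1, PC2):", ratio)
print("agreement with synthetic targets:", np.mean(labels == severity))
```

The same comparison against targets could be repeated with `scores[:, 1]` in place of PC1 to mirror the second experiment; on real data the thresholds would need to be read from the plotted PCA results rather than assumed.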

Journal: IEEE Access	Publication Date: Jan 1, 2023
License type: CC BY-NC-ND 4.0

Similar Papers

Author response: Limitations of principal components in quantitative genetic association models for human studies
Yiqi Yao ... Alejandro Ochoa
25 Apr 2023

Decision letter: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg ... Detlef Weigel
04 Jul 2022

Editor's evaluation: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg
04 Jul 2022

Author response: Sparse dimensionality reduction approaches in Mendelian randomisation with highly correlated exposures
Vasileios Karageorgiou ... Verena Zuber
28 Nov 2022
