Data analysis using representation theory and clustering algorithms

Suboh Alkhushayni, Du'A Alzaleq, Taeyoung Choi

doi:10.14419/ijet.v9i4.31234

Abstract

This work aims to expand knowledge in the area of data analysis through persistent homology and representations of directed graphs. Specifically, we examined how homology cluster groups can be analyzed using agglomerative hierarchical clustering algorithms and methods. Additionally, the Wine data set, which is available in RStudio, was analyzed using several clustering algorithms: hierarchical clustering, K-Means clustering, and PAM clustering. The goal of the analysis was to determine which clustering method is appropriate for a given numerical data set. Using agglomerative hierarchical clustering, we tested the data to identify the optimal clustering algorithm among three methods: K-Means, PAM, and Random Forest. By comparing each model's accuracy value with cultivar coefficients, we concluded that K-Means methods are the most helpful when working with numerical variables. On the other hand, PAM clustering and Gower with Random Forest are the most beneficial approaches when using categorical variables. With proper analysis, these tests can determine the optimal number of clusters for a given data set. The method can be applied to several industrial areas, such as clinical, business, and others. For example, in a clinical setting, patients can be grouped by a shared disease, required therapy, and other characteristics. Similarly, in business, clusters can be formed based on marginal profit, marginal cost, or other economic indicators.
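The abstract does not state how each model's accuracy was computed. As a minimal sketch, assuming accuracy means the share of wines whose cluster's majority cultivar matches their own cultivar (often called purity), the comparison could look like this in Python. The function name `cluster_accuracy` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def cluster_accuracy(true_labels, cluster_ids):
    """Fraction of points whose cluster's majority label matches their own label."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("need at least one observation")
    # Group the known labels by the cluster each observation was assigned to.
    by_cluster = defaultdict(Counter)
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster[cluster][label] += 1
    # Each cluster votes for its most common label; count the agreeing points.
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(true_labels)


# Hypothetical toy data: three cultivars, six wines, two clusterings to compare.
cultivar = [1, 1, 2, 2, 3, 3]
clustering_a = [0, 0, 1, 1, 2, 2]  # matches the cultivars exactly
clustering_b = [0, 0, 0, 1, 2, 2]  # one wine of cultivar 2 is misplaced

print(cluster_accuracy(cultivar, clustering_a))
print(cluster_accuracy(cultivar, clustering_b))
```

This measure can only rise as the number of clusters grows, so it is meaningful mainly when the compared methods use the same number of clusters.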

Highlights

As society continues to become more technologically advanced, data collection has become significantly easier and is done in almost every facet of life
When studying clusters of data using persistent homology, we look at varying scales and observe how clusters of data combine into larger clusters or vanish as the scale increases
We compared the accuracy values obtained using three different clustering methods
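The merging of clusters across scales described in the highlights can be illustrated with a short Python sketch. It assumes Euclidean points and treats the scale as a distance threshold: two points belong to the same cluster once a chain of points no farther apart than the threshold connects them. This tracks only connected components (0-dimensional persistent homology) with a union-find structure. The names `merge_scales` and `clusters_at` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist


def merge_scales(points):
    """Scales at which two separate clusters merge, in increasing order."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first, as in single-linkage clustering.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # two separate clusters meet at scale d
            parent[ri] = rj
            deaths.append(d)
    return deaths


def clusters_at(points, scale):
    """Number of clusters when points within `scale` of each other are joined."""
    return len(points) - sum(d <= scale for d in merge_scales(points))


# Hypothetical sample: two tight pairs of points that are far from each other.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
for scale in (0.5, 1.0, 4.0):
    print(scale, clusters_at(points, scale))
```

In this sample the four points start as four clusters, collapse into two pairs at scale 1, and become a single cluster at scale 4. The long gap between those two merge scales is the kind of persistent feature the method looks for.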

Summary

Introduction

As society continues to become more technologically advanced, data collection has become significantly easier and is done in almost every facet of life. We can collect data on nearly anything, from our favourite sports team's performance to the propagation of specific strains of the flu. Collecting data is less of a barrier than knowing how to interpret it and find what is relevant in each data set. This is where data science and analysis come into the picture. In many cases, the tests that have been around for decades can't keep up with the sheer volume and magnitude of the data sets we have available.

Objectives

Methods

Findings

Conclusion


Journal: International Journal of Engineering & Technology	Publication Date: Dec 18, 2020
Citations: 2	License type: cc-by


