K-means variations analysis for translation of English Tafseer Al-Quran text

Mohammed A Ahmed, Puteri Nor Ellyza Nohuddin, Hanif Baharin

doi:10.11591/ijece.v13i3.pp3255-3265


Open Access

https://doi.org/10.11591/ijece.v13i3.pp3255-3265


Abstract

Text mining is a powerful modern technique for extracting interesting information from huge datasets. Text clustering groups documents that share the same themes or topics. Because the datasets lack ground truth, clustering (unsupervised learning) must be used rather than alternatives such as classification (supervised learning). The “no free lunch” (NFL) theorem holds that no single algorithm outperforms all others across a variety of conditions (several datasets). This study analyzes three variations of the k-means clustering algorithm (k-means, mini-batch k-means, and k-medoids) at the clustering process stage. Six datasets, produced at the preprocessing stage from the English translation (text) of the 286 verses of chapter Al-Baqarah, were analyzed. Moreover, feature selection used term frequency–inverse document frequency (TF-IDF) to obtain the term weights. At the final stage, five internal cluster validation metrics were applied: silhouette coefficient (SC), Calinski-Harabasz index (CHI), C-index (CI), Dunn’s index (DI), and Davies-Bouldin index (DBI), together with execution time (ET). The experiments showed that k-medoids outperformed the other two algorithms in terms of ET only. In contrast, no algorithm was superior to the others in clustering quality across the six datasets, which is consistent with the NFL theorem.
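The pipeline summarized in the abstract (TF-IDF weighting, clustering, internal validation, timing) can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and it is not the authors' implementation. The toy documents, the choice of two clusters, and the helper name `evaluate` are hypothetical. Only k-means and mini-batch k-means are shown, with SC, CHI, DBI, and ET; k-medoids, the C-index, and Dunn's index are left out because they are not part of scikit-learn's core API.

```python
import time

from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Hypothetical stand-in documents; the paper clusters verse translations.
docs = [
    "guidance for those who believe in the unseen",
    "those who believe and establish prayer",
    "believers who spend from what they are given",
    "the cattle were created for your benefit",
    "cattle and horses and mules for you to ride",
    "livestock that carry your loads to distant lands",
]

# TF-IDF term weighting; converted to a dense array for the metric functions.
X = TfidfVectorizer(stop_words="english").fit_transform(docs).toarray()


def evaluate(model, X):
    """Fit a clustering model, time it, and compute internal validation scores."""
    start = time.perf_counter()
    labels = model.fit_predict(X)
    elapsed = time.perf_counter() - start
    return {
        "labels": labels,
        "SC": silhouette_score(X, labels),          # higher is better
        "CHI": calinski_harabasz_score(X, labels),  # higher is better
        "DBI": davies_bouldin_score(X, labels),     # lower is better
        "ET": elapsed,                              # seconds
    }


# k = 2 is an arbitrary choice for the toy data, not a value from the paper.
models = {
    "k-means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "mini-batch k-means": MiniBatchKMeans(n_clusters=2, n_init=10, random_state=0),
}
results = {name: evaluate(model, X) for name, model in models.items()}

for name, scores in results.items():
    print(name, {k: v for k, v in scores.items() if k != "labels"})
```

Comparing the printed scores across several datasets, rather than a single one, is what allows a statement about whether any one algorithm is consistently better.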


Journal: International Journal of Electrical and Computer Engineering (IJECE)
Publication Date: Jun 1, 2023
License type: CC BY-SA 4.0

