A Comparative Analysis of Classification Algorithms on Diverse Datasets

M Alghobiri

doi:10.48084/etasr.1952

Abstract

Data mining is the computational process of finding patterns in large data sets. Classification, one of the main domains of data mining, generalizes a known structure so that it can be applied to new data to predict its class. Various classification algorithms are used to classify data sets. They are based on different methods, such as probability, decision trees, neural networks, nearest neighbor, Boolean and fuzzy logic, and kernels. In this paper, we apply three diverse classification algorithms to ten datasets. The datasets were selected based on their size and/or the number and nature of their attributes. Results are discussed using performance evaluation measures such as precision, accuracy, F-measure, the Kappa statistic, mean absolute error, relative absolute error, and ROC area. The comparative analysis is carried out using accuracy, precision, and F-measure. We specify the features and limitations of the classification algorithms for datasets of diverse nature.
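
As an illustration of the comparison measures named in the abstract, the following minimal sketch computes accuracy, precision, F-measure, and the Kappa statistic for a two-class problem from lists of actual and predicted labels. It is not the paper's implementation: the function names (`confusion_counts`, `evaluate`) and the toy labels are assumptions made for this example, and the error-based measures and ROC area are left out.

```python
from typing import Dict, Sequence


def confusion_counts(actual: Sequence[str], predicted: Sequence[str],
                     positive: str) -> Dict[str, int]:
    """Count true/false positives and negatives, treating `positive` as the target class."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def evaluate(actual: Sequence[str], predicted: Sequence[str],
             positive: str) -> Dict[str, float]:
    """Return accuracy, precision, recall, F-measure and Cohen's kappa (hypothetical helper)."""
    c = confusion_counts(actual, predicted, positive)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    total = tp + fp + fn + tn

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)

    # Kappa compares observed agreement (accuracy) with the agreement
    # expected by chance from the marginal class frequencies.
    expected = (((tp + fp) / total) * ((tp + fn) / total)
                + ((fn + tn) / total) * ((fp + tn) / total))
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa}


# Toy labels, invented for illustration only (not data from the paper).
actual = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "no"]
predicted = ["yes", "yes", "yes", "no", "yes", "no", "no", "no", "no", "no"]
scores = evaluate(actual, predicted, positive="yes")
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the predictions of each classifier for each dataset gives one row of scores per classifier, which is the kind of side-by-side comparison the abstract describes.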

Highlights

Owing to the evolution of computer science and the rapid development and widespread use of the World Wide Web and other electronic data, information extraction is a popular research field
Selected classification algorithms: there are numerous classification algorithms, but we focus on algorithms of diverse nature, and three different algorithms have been chosen
C4.5 is a well-known algorithm based on decision trees, whereas Naïve Bayes is a probabilistic algorithm and the Support Vector Machine (SVM) is a kernel-based algorithm

Summary

Introduction

Owing to the evolution of computer science and the rapid development and widespread use of the World Wide Web and other electronic data, information extraction is a popular research field. Data mining [1, 2] is a significant method for extracting information from data. Classification [3, 4] is one of the main domains of data mining and has been used extensively for purposes such as decision making, weather forecasting, prediction of customers' attitudes, various social risk analyses and official tasks, and prediction of influential bloggers [5,6,7,8,9,10]. The first phase of classification generates the classification model, known as a classifier, which captures the relationship between characteristics and classes. Most classifiers assign class labels using probability calculations; accuracy itself has not been the target. Naïve Bayes and the C4.5 learning algorithm are alike in predictive accuracy [11,12,13].

Methods

Results

Conclusion

Journal: Engineering, Technology & Applied Science Research
Publication Date: Apr 19, 2018
Citations: 17
License type: cc-by

Similar Papers

Comparative Analysis of Classification Algorithms for Heart Disease
Shradha Solapure ... Prajit Thube
International Journal for Research in Applied Science and Engineering Technology | VOL. 11
31 May 2023

Comparative analysis of HAR datasets using classification algorithms
Suvra Nayak ... Meng-Yen Hsieh
Computer Science and Information Systems | VOL. 19
01 Jan 2021

A Comparison of Software Defect Prediction Metrics Using Data Mining Algorithms
Zeynep Behrin Güven Aydin ... Rüya Şamli
Journal of Innovative Science and Engineering (JISE) | VOL. 4
14 May 2020

The comparative analysis on the accuracy of k-NN, Naive Bayes, and Decision Tree Algorithms in predicting crimes and criminal actions in Sleman Regency
A H Wibowo ... T I Oesman
Journal of Physics: Conference Series | VOL. 1450
01 Feb 2020
