Abstract

With the growth of data mining and machine learning in recent years, many efforts have been made to make these fields accessible so that researchers from any discipline can use them easily. One of the most important of these efforts is the development of data mining tools that hide the underlying complexity from researchers, so that they can produce professional-quality output regardless of their level of expertise. This paper reviews and compares data mining and machine learning tools, including WEKA, KNIME, KEEL, Orange, Azure, IBM SPSS Modeler, R, and Scikit-Learn, to show what approach each tool takes to the complexities and problems that arise when data mining and machine learning are made accessible in different scenarios. In addition, for a more detailed review, this paper examines the challenge of network intrusion detection in two tools: KNIME, which has a graphical interface, and Scikit-Learn, which is a coding environment.

Highlights

  • The growth and penetration of the Internet has led to the production of large amounts of data by companies

  • Machine learning algorithms can be divided into four categories [8]; in supervised learning, a set of samples with their labels is given to the machine, and the machine should find a relationship between the samples and their labels

  • We compared a number of popular Knowledge Discovery from Data (KDD) tools, such as WEKA, KNIME, KEEL, Orange, Azure, IBM SPSS Modeler, and R, in terms of platforms, features, and algorithms

Summary

Introduction

Multimedia Tools and Applications (2021) 80:4999–5019

The growth and penetration of the Internet has led to the production of large amounts of data by companies. [...] useful information from these data. This research is very valuable for companies and, as a result, has led to the growth of data mining and machine learning technologies. Data mining tools use historical information to build a model that predicts customer behavior, e.g., which customers are likely to respond to a new product. Another example is intrusion detection in local systems or networks, where system and network activity is analyzed and processed by the data mining algorithms in these tools.
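
The idea of building a model from historical information to predict which customers will respond can be illustrated with a minimal sketch. It is not taken from the paper or from any of the reviewed tools: the nearest-centroid method, the two features, the function names, and every record below are assumptions chosen only to keep the example small and dependency-free.

```python
# Minimal supervised-learning sketch: a nearest-centroid classifier in plain Python.
# All records, feature names, and labels below are made up for illustration.
from math import dist


def fit_centroids(samples, labels):
    """Learn one mean feature vector (centroid) per label from labeled history."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical historical records: (visits per month, past purchases)
history = [(1, 0), (2, 1), (1, 1), (8, 5), (9, 7), (7, 6)]
# Hypothetical labels: did the customer respond to an earlier product?
responded = ["no", "no", "no", "yes", "yes", "yes"]

model = fit_centroids(history, responded)

if __name__ == "__main__":
    for customer in [(2, 0), (8, 6)]:
        print(customer, "->", predict(model, customer))
```

The tools reviewed here wrap this same fit-then-predict pattern behind a graphical workflow or a library API, typically with far more capable algorithms and with pre-processing steps that this sketch omits.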

Related works
Data mining algorithms and scenarios supported by the tools
Support
Pre-processing variety
Learning variety
Advanced features
Case study
Intrusion detection challenge
NSL-KDD dataset
Selected tools
Preprocessing
Results
Conclusions
Compliance with ethical standards