Abstract

The growing interest in extracting useful knowledge from data for the benefit of the data owner is giving rise to multiple data mining tools. The research community is especially aware of the importance of open source data mining software in ensuring and easing the dissemination of novel data mining algorithms. The availability of these tools at no cost, together with the chance to understand the approaches better by examining their source code, gives the research community an opportunity to tune and improve the algorithms. Documentation, update frequency, variety of algorithms, extensibility, and interoperability, among others, can be major factors that motivate users to opt for a specific open source data mining tool. The aim of this paper is to evaluate 19 open source data mining tools and to provide the research community with an extensive study based on a wide set of features that any tool should satisfy. The evaluation follows two methodologies. The first is based on scores provided by experts and produces a subjective judgment of each tool. The second performs an objective analysis of which features each tool satisfies. Together, the two procedures cover the features included in any data mining tool from both a subjective and an objective point of view. Results reveal that RapidMiner, Konstanz Information Miner, and Waikato Environment for Knowledge Analysis are the tools that include the highest percentage of these features.

WIREs Data Mining Knowl Discov 2017, 7:e1204. doi: 10.1002/widm.1204

This article is categorized under:
Application Areas > Data Mining Software Tools
Technologies > Computer Architectures for Data Mining

