Abstract
Data mining is the process of extracting informative and useful rules or relations from existing data that can be used to make predictions about the values of new instances. A wide range of commercial and open source software programs is used for data mining. In this study, several classification algorithms included in open source software such as WEKA, Tanagra and Scikit-learn are compared using the SEER (Surveillance, Epidemiology, and End Results) data set, which consists of 60948 instances.
Key words: Data mining, classification analysis, open source data mining tools.
Highlights
A wide range of algorithms can be used to extract information from data for data mining purposes
Weka is an open source data mining tool developed at Waikato University
The highest accuracy achieved with Weka is obtained with KStar (85.44%), followed by the J48 algorithm with an accuracy of 84.23%
Summary
A wide range of algorithms can be used to extract information from data for data mining purposes. Because data mining is applied in a wide range of areas, research on the technology is ongoing: new methods are continuously being developed and existing ones enhanced. This study compares several classification algorithms on health data.
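To make the idea of comparing classifiers by accuracy and error rate concrete, the following is a minimal sketch in plain Python. It is not the study's procedure: the data are synthetic (not SEER), the two classifiers (a majority-class baseline and a 1-nearest-neighbour rule) are stand-ins chosen for brevity rather than the algorithms evaluated in WEKA, Tanagra or Scikit-learn, and all function names are hypothetical.

```python
import random


def make_toy_data(n=200, seed=0):
    """Generate a synthetic two-class data set (an assumption; NOT the SEER data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 0 is centred at (0, 0), class 1 at (3, 3), both with unit spread.
        centre = 3.0 * label
        point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        data.append((point, label))
    return data


def majority_classifier(train):
    """Baseline: always predict the most frequent training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def nearest_neighbour_classifier(train):
    """Instance-based rule: predict the label of the closest training point."""
    def predict(x):
        nearest = min(
            train,
            key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
        )
        return nearest[1]
    return predict


def accuracy(predict, test):
    """Fraction of test instances whose predicted label matches the true label."""
    correct = sum(1 for x, y in test if predict(x) == y)
    return correct / len(test)


def compare(seed=0):
    """Train each classifier on 70% of the data and score it on the remaining 30%."""
    data = make_toy_data(seed=seed)
    split = int(0.7 * len(data))
    train, test = data[:split], data[split:]
    builders = [
        ("majority", majority_classifier),
        ("1-NN", nearest_neighbour_classifier),
    ]
    results = {}
    for name, build in builders:
        acc = accuracy(build(train), test)
        # Error rate is the complement of accuracy.
        results[name] = {"accuracy": acc, "error_rate": 1.0 - acc}
    return results


if __name__ == "__main__":
    for name, scores in compare().items():
        print(f"{name}: accuracy={scores['accuracy']:.2%}, "
              f"error rate={scores['error_rate']:.2%}")
```

The same relation links the two measures in any such comparison: error rate is 100% minus accuracy, so an accuracy of 85.44% corresponds to an error rate of 14.56%. A single hold-out split is used here only to keep the sketch short.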
More From: African Journal of Mathematics and Computer Science Research