Evaluation of Machine Learning Frameworks on Bank Marketing and Higgs Datasets

Bhuvan M Shashidhara, Nagamma Patil, G.S Raghavendra, Vinay D Rao, Siddharth Jain

doi:10.1109/icacce.2015.31

Abstract

Big data is an emerging field in which datasets of various sizes are analyzed for potential applications. In parallel, many frameworks are being introduced through which these datasets can be fed into machine learning algorithms. Although some experiments have compared different machine learning algorithms on different data, these comparisons have not been carried out across different platforms. Our research compares two selected machine learning algorithms on datasets of different sizes, deployed on different platforms such as Weka, Scikit-Learn, and Apache Spark. They are evaluated on training time, accuracy, and root mean squared error. This comparison helps decide which platform is best suited for applying the selected, computationally expensive machine learning algorithms to data of a particular size. Experiments suggested that Scikit-Learn would be optimal for data that fits into memory. When working with huge data, Apache Spark would be optimal, as it performs parallel computations by distributing the data over a cluster. Hence this study concludes that the Spark platform, which has growing support for parallel implementations of machine learning algorithms, could be optimal for analyzing big data.
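
The three evaluation criteria named in the abstract (training time, accuracy, and root mean squared error) can be illustrated with a minimal, framework-independent sketch. This is not the authors' code: the abstract does not name the two algorithms, so `MajorityClassifier` is a hypothetical stand-in model, the `evaluate` helper is an assumed name, and the toy data is invented purely for illustration.

```python
import math
import time
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between numeric labels and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return training time (seconds), accuracy, and RMSE for one model."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    training_time = time.perf_counter() - start  # only fitting is timed
    y_pred = model.predict(X_test)
    return {
        "training_time": training_time,
        "accuracy": accuracy(y_test, y_pred),
        "rmse": rmse(y_test, y_pred),
    }


# Toy data, invented for illustration only (not from either dataset in the paper)
X_train = [[0], [1], [2], [3], [4]]
y_train = [0, 0, 0, 1, 1]
X_test = [[5], [6], [7], [8]]
y_test = [0, 1, 0, 0]

result = evaluate(MajorityClassifier(), X_train, y_train, X_test, y_test)
print(result)
```

Any model exposing `fit` and `predict` could be passed to `evaluate` in the same way, which is one plausible way to keep the measurement identical across algorithms; comparing across platforms would additionally require running the equivalent measurement inside each framework.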

Similar Papers

Utilization of synthetic minority oversampling technique for improving potato yield prediction using remote sensing data and machine learning algorithms with small sample size of yield data
Hamid Ebrahimy ... Zhou Zhang
ISPRS Journal of Photogrammetry and Remote Sensing | VOL. 201
24 May 2023

Effective Selection of Machine Learning Algorithms for Big Data Analytics Using Apache Spark
Manar Mohamed Hafez ... Abd El Ftah Abdel Ghfar Hegazy
18 Oct 2016

Hybrid meta-heuristic and machine learning algorithms for tunneling-induced settlement prediction: A comparative study
Pin Zhang ... Tommy H.T Chan
Tunnelling and Underground Space Technology | VOL. 99
20 Mar 2020

Internet Traffic Classification Using Machine Learning
Li Jun ... Zhang Shunyi
01 Aug 2007
