Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey

Giang Nguyen, Viet Tran, Ignacio Heredia, Stefan Dlugolinsky, Martin Bobák, Peter Malík, Álvaro López García, Ladislav Hluchý

doi:10.1007/s10462-018-09679-z

Abstract

The combined impact of new computing resources and techniques with an increasing avalanche of large datasets is transforming many research areas and may lead to technological breakthroughs that can be used by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge amounts of real-world examples in disparate formats. The number of Machine Learning algorithms is extensive and growing, and so is the number of their implementations in frameworks and libraries. Software development in this field is fast paced, with a large amount of open-source software coming from academia, industry, start-ups, and wider open-source communities. This survey presents a recent time-slide comprehensive overview with comparisons as well as trends in the development and usage of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.

Highlights

Data mining (DM) is the core stage of the knowledge discovery process that aims to extract interesting and potentially useful information from data (Goodfellow et al 2016; Mierswa 2017).
Many methods can be applied for model evaluation, such as cross-validation, k-fold, and holdout, with various metrics such as accuracy (ACC), precision, recall, F1, Matthews correlation coefficient (MCC), receiver operating characteristic (ROC), area under the curve (AUC), mean absolute error (MAE), mean squared error (MSE), and root-mean-square error (RMSE).
The Machine Learning (ML) group at National Taiwan University provides support for MPI LibLinear, which is an extension of LibLinear for distributed environments, and for Spark LibLinear, which is a Spark implementation based on LibLinear and integrated with the Hadoop Distributed File System (NTU 2018).
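
The evaluation methods and metrics named in the second highlight can be illustrated with a minimal sketch in plain Python. It is not taken from the survey: the function names (k_fold_indices, classification_metrics, regression_metrics) are hypothetical, labels are assumed to be binary (0 or 1), folds are contiguous and unshuffled, and ROC/AUC are left out for brevity. In practice a library such as Scikit-Learn would normally be used instead.

```python
import math


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def classification_metrics(y_true, y_pred):
    """Return ACC, precision, recall, F1 and MCC for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Assumption: undefined ratios (zero denominator) are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "acc": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE for paired numeric sequences."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }
```

A holdout evaluation corresponds to a single train/test split, while k-fold cross-validation averages a chosen metric over the k test folds produced by k_fold_indices.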

Summary

Introduction

Data mining (DM) is the core stage of the knowledge discovery process that aims to extract interesting and potentially useful information from data (Goodfellow et al 2016; Mierswa 2017). The surge in the Volume of information, especially combined with its Variety, that must be processed by data mining and ML algorithms demands new, transformative parallel and distributed computing solutions capable of scaling computation effectively and efficiently (Cano 2018). In this context, this survey presents a comprehensive overview, with comparisons as well as trends in the development and usage, of cutting-edge AI software, libraries and frameworks that are able to learn and adapt from previous experience using ML and DL techniques in order to perform more accurate and more effective operations for problem solving (Rouse 2018).

Machine Learning process

Neural Networks and Deep Learning

Accelerated computing

Machine Learning frameworks and libraries

RapidMiner

Scikit-Learn

LibSVM

LibLinear

Vowpal Wabbit

XGBoost

Interactive data analytic and visualization tools

Other data analytic frameworks and libraries

Deep Learning frameworks and libraries

TensorFlow

Keras

Microsoft CNTK

Caffe2

PyTorch

Chainer

Theano

Performance-wise preliminary

Deep Learning wrapper libraries

Machine Learning and Deep Learning frameworks and libraries with MapReduce

Deeplearning4j

Apache Spark MLlib and Spark ML

Other frameworks and libraries with MapReduce

Findings

Conclusions


Journal: Artificial Intelligence Review
Publication Date: Jan 19, 2019
Citations: 463
License type: open-access


