Abstract

The availability of digital technology in the hands of citizens worldwide has made an unprecedented, massive amount of data available. The capability to process these gigantic amounts of data in real time with Big Data Analytics (BDA) tools and Machine Learning (ML) algorithms offers many benefits. However, the high number of free BDA tools, platforms, and data mining tools makes it challenging to select the appropriate one for a given task. This paper presents a comprehensive mini-literature review of ML in BDA. A keyword search identified a total of 1512 published articles, which were screened down to 140 based on the novel taxonomy proposed in this study. The study outcome shows that deep neural networks (15%), support vector machines (15%), artificial neural networks (14%), decision trees (12%), and ensemble learning techniques (11%) are widely applied in BDA. The related application fields, challenges, and, most importantly, the openings for future research are detailed.

Highlights

  • Huge volumes of data are being generated every day in a variety of fields, from social networks to engineering and commerce to biomolecular research and phycology[1,2]

  • Based on the ontology proposed in Sun et al.[5], this study groups big data analytics into three types, namely, (i) Big Data (BD) descriptive analytics, (ii) BD predictive analytics, and (iii) BD prescriptive analytics

  • The results suggest that high-impact publishing houses have seen the need to make big data analytics with Machine Learning (ML) applications available to the scientific community


Summary

Introduction

Huge volumes of data are being generated every day in a variety of fields, from social networks to engineering and commerce to biomolecular research and phycology[1,2]. Digital data generated from various digital platforms and devices are growing at astounding rates worldwide. In 2011, digital information had grown nine times in volume compared with 2006, and it was estimated to reach 44 zettabytes by 2020[1,3]. As of 16th December 2020, the volume of daily generated data globally was 59 zettabytes. It is anticipated to reach 149 zettabytes[4] in 2024 as we go into an even more data-driven future. The escalating volume of data is the principal attribute of “big data”, a term that has become a household name in research communities, organisations, and on the Internet.

