Developing a big data analytics platform using Apache Hadoop Ecosystem for delivering big data services in libraries

Ranjeet Kumar Singh

doi:10.1108/dlp-10-2022-0079

Abstract

Purpose: Although the challenges associated with big data are increasing, the question of the most suitable big data analytics (BDA) platform for libraries remains significant. The purpose of this study is to propose a solution to this problem.

Design/methodology/approach: The study identifies relevant literature and reviews big data adoption in libraries. It also presents a step-by-step guide for developing a BDA platform using the Apache Hadoop Ecosystem. To test the system, library big data was analyzed using Apache Pig, a tool from the Apache Hadoop Ecosystem. The analysis establishes the effectiveness of the Apache Hadoop Ecosystem as a powerful BDA solution in libraries.

Findings: It can be inferred from the literature that libraries and librarians have not taken the possibility of big data services in libraries very seriously. The literature also suggests that no significant effort has been made to establish a BDA architecture in libraries. This study establishes the Apache Hadoop Ecosystem as a possible solution for delivering BDA services in libraries.

Research limitations/implications: The present work suggests adopting the idea of providing various big data services in a library by developing a BDA platform. Examples include assisting researchers in understanding big data, having skilled and experienced data managers clean and curate big data, and providing the infrastructural support to store, process, manage, analyze and visualize big data.

Practical implications: The study concludes that Apache Hadoop's Hadoop Distributed File System and MapReduce components significantly reduce the complexities of big data storage and processing, respectively, and that Apache Pig, using the Pig Latin scripting language, is very efficient in processing big data and responds to queries quickly.

Originality/value: According to the study, relatively few efforts have been made to analyze big data from libraries. Furthermore, acceptance of the Apache Hadoop Ecosystem as a solution to big data problems in libraries is not widely discussed in the literature, although Apache Hadoop is regarded as one of the best frameworks for big data handling.

Journal: Digital Library Perspectives
Publication Date: Feb 22, 2024
Citations: 2

Similar Papers

Design and Research of Big Data Collection and Analysis Platform Based on Cloud Computing
Xuan Pei ... Xiaoying Ren
IOP Conference Series: Materials Science and Engineering | VOL. 677
01 Dec 2019

The growing role of integrated and insightful big and real-time data analytics platforms
Ranganathan Indrakumari ... Balusamy Balamurugan
21 Nov 2019

LEO IoT based big data management and analysis platform design for intermodal containers
Jieyin Lyu ... Xiangmo Zhao
IOP Conference Series: Materials Science and Engineering | VOL. 715
01 Jan 2020

Grasping Popular Applications in Cellular Networks With Big Data Analytics Platforms
Pierdomenico Fiadino ... Arian Baer
IEEE Transactions on Network and Service Management | VOL. 13
01 Sep 2016
