Abstract

This paper presents an in-depth analysis of the automatic selection and parameter configuration of core Big Data software components. To address the component-selection problem in Big Data application development, we establish standardized requirement indicators and, based on a retention model, use a decision tree model to select components automatically from the three categories of user requirements: storage, computation, and analysis. To address the problem of undetectable packet loss in data transmission on existing IoT and Web service platforms, we propose a data transmission intermediate platform with bidirectional data detection; its data communication module enables mutual monitoring and detection of data interaction between IoT smart terminals and cloud platforms. A retention model is built separately for each requirement category to realize the automatic selection of Big Data components. Starting from several mainstream distributed storage systems, we use Cassandra as an example for experiments and tests. We build a performance model for hardware parameters by multiple regression fitting, take user requirements as input, and use the performance model to configure the system's hardware parameters; by studying the system's principles, architecture, features, and application scenarios, we build a software parameter configuration knowledge base to guide software parameter configuration. Together, these methods address the difficult problems of selecting, deploying, and configuring parameters for Big Data applications.
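As a minimal sketch of the multiple-regression approach the abstract describes, the snippet below fits a linear performance model over hardware parameters and then scans candidate configurations against a user-supplied throughput requirement. The feature set (cores, memory, disks), the sample measurements, and the selection heuristic are illustrative assumptions, not the paper's actual model or data.

```python
# Sketch of a multiple-regression performance model for hardware
# parameters. All numbers below are synthetic placeholders.
import numpy as np

# Hypothetical benchmark observations: [CPU cores, memory (GB), disk count]
X = np.array([
    [4,  16, 1],
    [8,  32, 2],
    [16, 64, 4],
    [8,  16, 2],
    [16, 32, 4],
], dtype=float)
# Observed throughput (ops/s) for each configuration (made-up values).
y = np.array([12000, 25000, 48000, 20000, 42000], dtype=float)

# Fit y ~ b0 + b1*cores + b2*mem + b3*disks by least squares.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_throughput(cores: float, mem_gb: float, disks: float) -> float:
    """Predicted throughput for a candidate hardware configuration."""
    return float(coef @ np.array([1.0, cores, mem_gb, disks]))

# Take the user requirement as input and pick the smallest feasible
# configuration, mirroring the "requirements in, parameters out" flow.
required = 30000  # assumed user requirement in ops/s
candidates = [(c, m, d) for c in (4, 8, 16)
                        for m in (16, 32, 64)
                        for d in (1, 2, 4)]
feasible = [cfg for cfg in candidates if predict_throughput(*cfg) >= required]
print(min(feasible))  # e.g. the cheapest configuration meeting the target
```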

Highlights

  • Big Data technology is no longer unfamiliar to us, and applications of Big Data technology are everywhere

  • Analysis of automatic selection and parameter configuration results: the experiment tests the maximum performance achievable by continuously increasing the number of client threads. The test uses 10 columns per row, with an average of 10 characters per column

  • For storage and compute requirements, the selections of the storage and compute systems should be output at the same time; if the user selects analysis requirements, the selections of the storage, compute, and analysis systems should be output at the same time. The paper gives pseudocode for this component selection process (a sketch follows this list)
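The excerpt does not include the paper's pseudocode, so the following is a hedged reconstruction of the selection rule the highlight states: storage and compute choices are emitted together, and an analysis requirement implies the full stack. The requirement names and the candidate systems (Cassandra, Spark, Hive) are illustrative assumptions.

```python
# Hedged sketch of the component-selection rule described above; the
# candidate systems are placeholders, not the paper's decision output.
def select_components(requirements: set) -> dict:
    """Map user requirement categories to Big Data component choices."""
    selection = {}
    if "storage" in requirements or "compute" in requirements:
        # Storage and compute selections are always output together.
        selection["storage"] = "Cassandra"  # example choice
        selection["compute"] = "Spark"      # example choice
    if "analysis" in requirements:
        # An analysis requirement implies storage + compute + analysis.
        selection.setdefault("storage", "Cassandra")
        selection.setdefault("compute", "Spark")
        selection["analysis"] = "Hive"      # example choice
    return selection

print(select_components({"storage", "compute"}))  # storage + compute
print(select_components({"analysis"}))            # full stack
```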


Introduction

Big Data technology is no longer unfamiliar to us, and its applications are everywhere. How to exploit the value of Big Data more effectively has become a direction of effort for many people [1]. To take advantage of that value, we must process the data. Common Big Data task processing steps include data decompression, data cleansing, data loading, data conversion, and data backup [2]. Scheduling systems exploit the interdependencies among these tasks, automatically scheduling each job according to its dependencies to reduce manual operations [3]. A Big Data application scheduling system can handle both the scheduling of these simple tasks and the scheduling management of complex Big Data tasks [4].
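To illustrate the dependency-driven scheduling just described, the sketch below orders the pipeline stages from the Introduction so that each task runs only after its prerequisites finish. The dependency graph is an assumption for illustration; a real scheduler would dispatch distributed jobs rather than print names.

```python
# Illustrative sketch of scheduling tasks by their interdependencies,
# using a topological order over the pipeline stages named above.
from graphlib import TopologicalSorter

# Assumed dependency graph: each task maps to its prerequisites.
dag = {
    "decompress": set(),
    "clean":      {"decompress"},
    "load":       {"clean"},
    "convert":    {"load"},
    "backup":     {"convert"},
}

for task in TopologicalSorter(dag).static_order():
    print(f"running {task}")  # a real system would submit the job here
```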

