Parallel Algorithm for Reduction of Data Processing Time in Big Data

Amelec (Jesus) Viloria (Silva), Hugo Hernández Palma, William Niebles Núñez, Noel Varela, David Ovallos-Gazabon

doi:10.1088/1742-6596/1432/1/012095

Abstract

Technological advances have made it possible to collect and store large volumes of data over the years. It is therefore important that today’s applications deliver high performance and can analyze these large datasets effectively. It remains a challenge for data mining to keep its algorithms and applications efficient as data size and dimensionality increase [1]. To achieve this goal, many applications rely on parallelism, which reduces the cost associated with algorithm execution time by taking advantage of current computer architectures to run several processes concurrently [2]. This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that each processing thread can process, synchronously and independently.

Highlights

FuzzyPred is a data mining method that extracts fuzzy predicates in conjunctive and disjunctive normal form [3] [4].
This paper proposes a parallel version of the FuzzyPred algorithm based on the amount of data that each processing thread can process, synchronously and independently.
Because parallel computing can be exploited to solve data mining problems, this paper presents a parallel version of FuzzyPred with the purpose of reducing runtime.

Summary

Introduction

FuzzyPred is a data mining method that extracts fuzzy predicates in conjunctive and disjunctive normal form [3] [4]. The method is modeled as a combinatorial optimization problem because the solution space to be explored can become very large. Each generated solution (or predicate) is evaluated sequentially against every record in the database. Because the size and the number of variables of current databases grow every day, this evaluation can lead to high response times when using FuzzyPred [5]. Experiments compare the sequential version with the parallel version of FuzzyPred on two performance metrics (acceleration and efficiency).
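
The idea of evaluating one predicate over data split across threads can be illustrated with a minimal sketch. It is not the authors' implementation: the min/max fuzzy operators, the use of the mean truth value as the aggregate, the equal-size partitioning, and every name in the code (truth_value, partition, evaluate_parallel, and so on) are assumptions made for illustration. Acceleration and efficiency are computed with the standard definitions (sequential time divided by parallel time, and that ratio divided by the number of workers), which the text above does not spell out. Note that in CPython, threads running pure-Python arithmetic are limited by the global interpreter lock, so this sketch shows the structure of the partitioning rather than a guaranteed runtime gain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Hypothetical encoding: a literal is (variable_index, negated); a clause is a
# disjunction of literals; a predicate is a conjunction of clauses (CNF).
Literal = Tuple[int, bool]
Clause = List[Literal]
Predicate = List[Clause]
Record = Sequence[float]  # membership degrees in [0, 1], one per variable


def truth_value(predicate: Predicate, record: Record) -> float:
    """Truth of a CNF predicate on one record (assumed min/max operators)."""
    def literal_value(literal: Literal) -> float:
        index, negated = literal
        return 1.0 - record[index] if negated else record[index]

    return min(max(literal_value(lit) for lit in clause) for clause in predicate)


def partition(records: Sequence, n_chunks: int) -> List[Sequence]:
    """Split records into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few records
            chunks.append(records[start:end])
        start = end
    return chunks


def chunk_sum(predicate: Predicate, chunk: Sequence[Record]) -> float:
    """Sum of truth values over one chunk; each chunk is independent."""
    return sum(truth_value(predicate, record) for record in chunk)


def evaluate_sequential(predicate: Predicate, records: Sequence[Record]) -> float:
    """Mean truth value, visiting every record in a single loop."""
    return chunk_sum(predicate, records) / len(records)


def evaluate_parallel(predicate: Predicate, records: Sequence[Record],
                      n_workers: int = 4) -> float:
    """Mean truth value, with one chunk of records per worker thread."""
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # The workers share no state; results are combined after all finish.
        partial_sums = list(pool.map(lambda c: chunk_sum(predicate, c), chunks))
    return sum(partial_sums) / len(records)


def speedup(t_sequential: float, t_parallel: float) -> float:
    """Acceleration: sequential runtime divided by parallel runtime."""
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, n_workers: int) -> float:
    """Speedup divided by the number of workers."""
    return speedup(t_sequential, t_parallel) / n_workers
```

Each worker only reads its own chunk and returns a partial sum, so the single synchronization point is the final combination step. For CPU-bound workloads in Python, a process pool or a vectorized numerical library may be a better fit than threads.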



Journal: Journal of Physics: Conference Series
Publication Date: Jan 1, 2020
License type: cc-by

Similar Papers

Pharmacy: Harnessing The Power Of Big Data
Amy K Erickson
Pharmacy Today | VOL. 20
01 Nov 2014

Big data analytics in Industry 4.0 ecosystems
Gagangeet Singh Aujla ... Radu Prodan
Software: Practice and Experience | VOL. 52
11 Jun 2021

Function Modeling Improves the Efficiency of Spatial Modeling Using Big Data from Remote Sensing
John Hogland ... Nathaniel Anderson
Big Data and Cognitive Computing | VOL. 1
13 Jul 2017

Effect of Corpus Size Selection on Performance of Map-Reduce Based Distributed K-Means for Big Textual Data Clustering
Shwet Ketu ... Bakshi Rohit Prasad
25 Sep 2015
