Distributed Programming Model Research Articles

In this paper, we present a generic, scalable and adaptive load-balancing approach to parallel Lagrangian particle tracking in Wiener-type processes such as Brownian motion. The approach is particularly suitable for problems involving particles with highly variable computation time, such as deposition on boundaries that may include decay, where particle lifetime follows an exponential distribution. At first glance, Lagrangian tracking is highly suitable for a distributed programming model because separate particles move independently. However, the commonly employed Decomposition Per Particle (DPP) method, in which each process is in charge of a certain number of particles, actually displays poor parallel efficiency because of the high variability of particle lifetime across a wide set of deposition problems that optionally include decay. The proposed method removes the defects of DPP and brings a novel approach to discrete particle tracking. The algorithm introduces a master/slave model dubbed Partial Trajectory Decomposition (PTD), in which a certain number of processes produce partial trajectories and put them into a shared queue, while the remaining processes simulate actual particle motion using the previously generated partial trajectories. Our approach also introduces meta-heuristics for determining the optimal values of partial trajectory length, chunk size and the number of processes acting as producers and consumers for a given total number of participating processes (Optimized Partial Trajectory Decomposition, OPTD). The optimization process employs a surrogate model to estimate the simulation time. The surrogate is based on historical data and uses a coupled machine learning model consisting of classification and regression phases.
OPTD was implemented in C, using standard MPI for message passing, and benchmarked on a model of ²²⁰Rn progeny in a diffusion chamber, where particle motion is characterized by an exponential lifetime distribution and a Maxwell velocity distribution. The speedup improvement of OPTD is approximately 320% over standard DPP, reaching almost ideal speedup on up to 256 CPUs.
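
The producer/consumer split described in this abstract can be illustrated with a minimal sketch. It is not the authors' C/MPI implementation: it assumes Python threads in place of MPI processes, a one-dimensional walk with unit-variance Gaussian increments, no boundaries or deposition, and a consumer that discards the unused tail of a partial trajectory. The function `run_ptd` and all of its parameters are hypothetical names chosen for illustration.

```python
import queue
import random
import threading


def run_ptd(n_particles=200, n_producers=2, n_consumers=2,
            traj_len=32, mean_steps=100.0, seed=0):
    """Toy producer/consumer particle tracking; returns (lifetime, position) pairs."""
    trajectories = queue.Queue(maxsize=64)  # shared queue of partial trajectories
    particles = queue.Queue()               # work list: one lifetime per particle
    rng = random.Random(seed)
    for _ in range(n_particles):
        # Assumed: lifetime in steps is exponentially distributed, at least 1.
        particles.put(max(1, int(rng.expovariate(1.0 / mean_steps))))

    stop = threading.Event()
    results = []
    lock = threading.Lock()

    def producer(pid):
        prng = random.Random(seed + 1 + pid)
        while not stop.is_set():
            # One partial trajectory: traj_len Gaussian increments.
            chunk = [prng.gauss(0.0, 1.0) for _ in range(traj_len)]
            while not stop.is_set():
                try:
                    trajectories.put(chunk, timeout=0.05)
                    break
                except queue.Full:
                    continue  # queue is full; retry unless asked to stop

    def consumer():
        while True:
            try:
                lifetime = particles.get_nowait()
            except queue.Empty:
                return  # no particles left
            steps_left, x = lifetime, 0.0
            while steps_left > 0:
                chunk = trajectories.get()  # reuse a pre-generated trajectory
                used = min(steps_left, len(chunk))
                x += sum(chunk[:used])
                steps_left -= used
            with lock:
                results.append((lifetime, x))

    producers = [threading.Thread(target=producer, args=(i,))
                 for i in range(n_producers)]
    consumers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for t in producers + consumers:
        t.start()
    for t in consumers:
        t.join()
    stop.set()  # all particles are finished; let the producers exit
    for t in producers:
        t.join()
    return results


if __name__ == "__main__":
    done = run_ptd(n_particles=50, n_producers=2, n_consumers=3, traj_len=16)
    print(len(done), "particles simulated")
```

Because consumers pull particles from a common work list instead of owning a fixed share, a long-lived particle does not leave the other workers idle, which is the load-balancing effect the abstract attributes to PTD. The values of `traj_len`, `n_producers` and `n_consumers` correspond loosely to the quantities that OPTD is said to tune.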

Mining high utility item sets from a transactional database refers to the discovery of item sets with high utility, such as profit. Although a number of relevant algorithms have been proposed in recent years, they incur the problem of producing a large number of candidate item sets for high utility item sets. Such a large number of candidate item sets degrades mining performance in terms of execution time and space requirements. The situation may become worse when the database contains many long transactions or long high utility item sets. As Internet purchasing and transactions have increased in recent years, mining high utility item sets, especially from big transactional databases, has become a required task for processing many day-to-day operations quickly. Many of the methods presented for mining high utility item sets from large transactional datasets are subject to serious limitations; for example, their performance in low-memory systems still needs to be investigated and addressed. Another limitation is that these methods cannot avoid the screening and overhead of null transactions, so performance degrades drastically. In this paper, we present a new approach to overcome these limitations. We present a distributed programming model for mining business-oriented transactional datasets using an improved MapReduce framework on Hadoop, which not only overcomes the limits of single-processor and main-memory-based computing but is also highly scalable as the database size increases. We apply this approach to the existing UP-Growth and UP-Growth+ algorithms with the aim of further improving their performance. In experimental studies, we compare the performance of the existing UP-Growth and UP-Growth+ algorithms against the improved UP-Growth and UP-Growth+ with Hadoop.
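
As a rough illustration of how a utility computation fits the map/reduce pattern, the sketch below computes the transaction-weighted utility (TWU) of each item and keeps the items whose TWU reaches a threshold. It assumes the usual definitions from the utility-mining literature (item utility is quantity times unit profit, and an item's TWU is the sum of the utilities of the transactions containing it). It runs in plain Python, not on Hadoop, and is not the improved framework from the abstract. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def transaction_utility(tx, profit):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in tx.items())


def map_phase(tx, profit):
    """Emit (item, transaction utility) for every item in the transaction."""
    tu = transaction_utility(tx, profit)
    return [(item, tu) for item in tx]


def reduce_phase(pairs):
    """Sum the emitted transaction utilities per item to obtain its TWU."""
    twu = defaultdict(int)
    for item, tu in pairs:
        twu[item] += tu
    return dict(twu)


def promising_items(transactions, profit, min_util):
    """Items whose TWU is at least min_util (candidates for further mining)."""
    pairs = [p for tx in transactions for p in map_phase(tx, profit)]
    twu = reduce_phase(pairs)
    return {item: value for item, value in twu.items() if value >= min_util}


if __name__ == "__main__":
    # Hypothetical data: unit profit per item, and item quantities per transaction.
    profit = {"a": 5, "b": 2, "c": 1}
    transactions = [{"a": 1, "b": 2}, {"b": 3, "c": 4}, {"a": 2, "c": 1}]
    print(promising_items(transactions, profit, min_util=20))
```

In a real MapReduce job the map calls would run on separate partitions of the database and the framework would group the emitted pairs by item before the reduce step. Here that grouping is done by a single dictionary.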

Articles published on Distributed Programming Model

Comparative Analysis of Skew-Join Strategies for Large-Scale Datasets with MapReduce and Spark

Data Analysis Method of Intelligent Analysis Platform for Big Data of Film and Television

DisCANTree: A Distributed Algorithm for Incremental Frequent Itemset Mining based on MapReduce

Programmable Logic Controllers in the Context of Industry 4.0

Optimizing parallel particle tracking in Brownian motion using machine learning

Parallel Computation of Rough Set Approximations in Information Systems with Missing Decision Data

The MapReduce Model on Cascading Platform for Frequent Itemset Mining

Distributing relational model transformation on MapReduce

A strategy to load balancing for non-connectivity MapReduce job

MixHeter: A global scheduler for mixed workloads in heterogeneous environments

SparkBench: a spark benchmarking suite characterizing large-scale in-memory data analytics

Efficient Pairwise Document Similarity Computation in Big Datasets

Management of Possible Roles for Distributed Software Projects Using Layer Architecture

Performance Comparison of OpenMP, MPI, and MapReduce in Practical Problems

Performance Improvement of MapReduce Framework in Heterogeneous Context using Reinforcement Learning

The Optimization and Improvement of MapReduce in Web Data Mining

Triolet

Utility Mining Algorithm for High Utility Item sets from Transactional Databases

Graph-based parallel distributed genetic programming model
