Abstract
In a Grid computing environment, several applications, such as scientific data analysis and visualization, are both computation and communication intensive. These applications can be decomposed into a sequence of pipeline stages that can be placed on different Grid nodes for concurrent execution. Because computation and communication costs both contribute to the total processing time, finding a placement of the pipeline stages on a Grid that maximizes application throughput is a challenging problem. This paper proposes a solution that considers both the pipeline placement and the data movement between stages. Specifically, we try to minimize the computation cost of the pipeline stages while preventing the communication overhead between successive stages from dominating the entire processing time. Our proposed solution consists of two novel methods. The first method is single-path pipeline execution, which exploits only temporal parallelism. The second method is multipath pipeline execution, which exploits both the temporal and the spatial parallelism inherent in pipeline applications. We evaluate our work in a simulated environment and also conduct a set of experiments in a real Grid computing system. Compared with several traditional placement methods, our proposed methods achieve the highest throughput.
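To make the placement problem concrete, the following is a minimal sketch under a simplified cost model that is assumed here and is not taken from the paper: each stage spends its compute time plus the time to send its output downstream, and steady-state throughput is bounded by the slowest stage. The function names, the uniform `bandwidth` parameter, and the brute-force search are all hypothetical illustrations, not the proposed methods.

```python
from itertools import permutations


def stage_time(work, speed, data_out, bandwidth):
    """Per-item time of one stage: compute time plus time to send its output."""
    return work / speed + data_out / bandwidth


def throughput(stage_times):
    """Steady-state throughput of a linear pipeline, limited by its slowest stage."""
    return 1.0 / max(stage_times)


def best_single_path(work, data_out, speeds, bandwidth):
    """Try every one-stage-per-node placement (toy sizes only) and keep the best.

    Assumption: all links share the same bandwidth, so only node speed matters.
    Returns (throughput, tuple of node indices, one per stage).
    """
    best_tp, best_nodes = 0.0, None
    for nodes in permutations(range(len(speeds)), len(work)):
        times = [
            stage_time(w, speeds[n], d, bandwidth)
            for w, n, d in zip(work, nodes, data_out)
        ]
        tp = throughput(times)
        if tp > best_tp:
            best_tp, best_nodes = tp, nodes
    return best_tp, best_nodes


def multipath_throughput(paths):
    """Aggregate throughput of independent pipeline copies on disjoint nodes.

    Assumption: the paths do not share nodes or links, so their rates add.
    """
    return sum(throughput(times) for times in paths)


# Hypothetical three-stage pipeline on three nodes of differing speed.
work = [4.0, 8.0, 2.0]        # compute units per item for each stage
data_out = [1.0, 1.0, 0.0]    # data units each stage sends downstream
speeds = [1.0, 2.0, 4.0]      # compute units per second for each node
tp, nodes = best_single_path(work, data_out, speeds, bandwidth=1.0)
```

In this toy model, a single path exploits only temporal parallelism (stages overlap on successive items), while `multipath_throughput` adds spatial parallelism by running several pipeline copies side by side.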