Abstract

This paper proposes an innovative strategy for data-flow synchronization in shared-memory systems. The strategy synchronizes only interdependent threads, in contrast to the barrier approach, which synchronizes all threads. We demonstrate the adaptation of the data-flow synchronization strategy to two complex scientific applications based on stencil codes. An algorithm for the data-flow synchronization is developed and successfully used for both applications. The proposed approach is evaluated on various Intel microarchitectures released in the last 5 years, including the newest processors: Skylake and Knights Landing. An important part of this assessment is the performance comparison of the proposed data-flow synchronization with the OpenMP barrier. The experimental results show that the studied applications can be accelerated by up to 1.3 times using the proposed data-flow synchronization strategy.

Highlights

  • The huge capacity of modern HPC platforms allows complex problems, previously thought impossible, to be solved [12]

  • We propose an innovative strategy for the data-flow synchronization in shared-memory systems

  • Since only adjacent threads depend on each other, we propose to synchronize within each pair of adjacent threads instead of using the barrier approach
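The pairwise idea in the last highlight can be illustrated with a small sketch. It is not the authors' algorithm, which targets OpenMP threads; it is a Python illustration under stated assumptions: a 1D three-point averaging stencil, double buffering, one contiguous chunk per thread, and at least one cell per thread. The names `step_chunk`, `run_serial`, and `run_dataflow` are hypothetical. Each thread keeps a counter of completed time steps and, before step s, waits only until its left and right neighbours have completed s steps, rather than waiting for all threads at a barrier.

```python
import threading


def step_chunk(src, dst, lo, hi):
    """Apply a 3-point averaging stencil to cells lo..hi-1; end cells stay fixed."""
    n = len(src)
    for j in range(lo, hi):
        if j == 0 or j == n - 1:
            dst[j] = src[j]
        else:
            dst[j] = (src[j - 1] + src[j] + src[j + 1]) / 3.0


def run_serial(data, steps):
    """Reference result computed by a single thread."""
    bufs = [list(data), list(data)]
    for s in range(steps):
        step_chunk(bufs[s % 2], bufs[(s + 1) % 2], 0, len(data))
    return bufs[steps % 2]


def run_dataflow(data, steps, num_threads):
    """Same computation, with each thread waiting only for its two neighbours."""
    n = len(data)
    if n < num_threads:
        # Assumption of this sketch: every thread owns at least one cell.
        raise ValueError("need at least one cell per thread")
    bufs = [list(data), list(data)]
    bounds = [(t * n // num_threads, (t + 1) * n // num_threads)
              for t in range(num_threads)]
    done = [0] * num_threads      # done[t] = number of steps thread t has finished
    cond = threading.Condition()  # protects 'done' and wakes waiting threads

    def worker(t):
        lo, hi = bounds[t]
        neighbours = [u for u in (t - 1, t + 1) if 0 <= u < num_threads]
        for s in range(steps):
            # Wait until both neighbours have finished step s-1: their boundary
            # cells for step s are then written, and they no longer read the
            # buffer that this thread is about to overwrite.
            with cond:
                cond.wait_for(lambda: all(done[u] >= s for u in neighbours))
            step_chunk(bufs[s % 2], bufs[(s + 1) % 2], lo, hi)
            with cond:
                done[t] = s + 1
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return bufs[steps % 2]


if __name__ == "__main__":
    data = [float(i % 7) for i in range(64)]
    print(run_dataflow(data, 20, 4) == run_serial(data, 20))
```

With this rule a thread can run at most one step ahead of its neighbours, so threads far apart in the domain are free to drift further apart in time. The single shared condition variable is a simplification chosen for brevity; a per-pair flag or counter would be closer to the per-couple synchronization the highlight describes.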


Summary

Introduction

The huge capacity of modern HPC platforms allows complex problems, previously thought impossible, to be solved [12]. The EULAG model is an innovative solver in the field of numerical modeling of multiscale geophysical flows. Another application area tackled in this work is the phase-field method, which is a powerful tool for solving interfacial problems in materials science [11]. Synchronization strategies based on data-flow communication layers are very popular in distributed-memory programming standards and libraries, including MPI and the hStreams programming library [5]. There, the synchronization between interdependent processing elements is explicitly defined according to the communication flows of data, using specific commands such as MPI_Send and MPI_Recv in the case of MPI. This is achieved on a totally different level of programming abstraction than in the proposed approach.

Strategy for data-flow synchronization in stencils
Adaptation of data-flow synchronization strategy to MPDATA
Experimental results
Conclusions and future work
