Abstract
The cache ping-pong problem arises frequently in parallel processing systems in which each processor has its own local cache and uses a copy-back protocol for cache coherence. To keep large amounts of data from moving back and forth between the caches of different processors, parallel compilers need supporting techniques. Building on the concept in [Fang, J. Z., Proc. International Conference on Parallel Processing, Aug. 1990, pp. II-271–II-275] of the relations between array element accesses and the indices of the enclosing loops in nested parallel loops, this paper presents an algorithm that reduces unnecessary data movement between caches for parallel loops with multiple array subscript expressions. By analyzing the array subscript expressions in nested parallel loop constructs, a compiler can use the algorithm to prepare information at compile time so that each processor executes the iterations of the parallel loops that correspond to the data already in its cache. The algorithm benefits parallel programs in which parallel loops are enclosed by a sequential loop and contain multiple different subscript expressions for the same array, whose elements are reused repeatedly across iterations of the outermost sequential loop.
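To make the loop pattern concrete, the following is a minimal illustrative sketch and not the algorithm of the paper. It assumes a toy ownership model in which an array element lives in the cache of the processor that last wrote it, and it simulates two parallel loops (one writing A[i], one writing A[i+1]) enclosed by a sequential loop. The function name `count_transfers`, its parameters, and both assignment policies are hypothetical and chosen only for illustration.

```python
def count_transfers(n, procs, steps, by_element):
    """Count cache-to-cache moves under a toy 'last writer owns the line' model.

    by_element=False: iteration i always runs on processor i % procs.
    by_element=True:  the iteration runs on the processor tied to the
                      element it writes (an assumed affinity policy).
    """
    owner = {}       # element index -> processor that last wrote it
    transfers = 0
    for _ in range(steps):              # outermost sequential loop
        for offset in (0, 1):           # two parallel loops: A[i] and A[i+1]
            for i in range(n):          # parallel iterations, simulated in order
                elem = i + offset       # array subscript expression
                p = (elem if by_element else i) % procs
                if owner.get(elem, p) != p:
                    transfers += 1      # element must move between caches
                owner[elem] = p
    return transfers


naive = count_transfers(n=8, procs=4, steps=3, by_element=False)
aware = count_transfers(n=8, procs=4, steps=3, by_element=True)
print(naive, aware)
```

Under these assumptions, the index-based assignment keeps moving elements between processors because A[i] and A[i+1] map the same element to different iterations, while the element-based assignment causes no moves in this toy model.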