Abstract
How to exploit multicore computing resources to accelerate high-performance computing (HPC) applications has become a common concern. However, HPC applications have not yet been thoroughly explored under thread-level speculation (TLS), especially at the procedure level. This paper proposes a procedure- and loop-level speculation architecture model for speeding up HPC applications, including its speculative mechanism and analysis method. It also takes several applications from SPLASH2 and analyses their potential for speculative parallelism together with the factors that affect performance. The experimental results show that: (1) Barnes, the best-performing application, achieves a 90.9× speedup under loop-level speculation, while LU achieves a 40.2× speedup under procedure-level speculation; (2) limited parallelism coverage and severe inter-thread data dependence violations badly degrade both loop- and procedure-level speculative parallelism in some HPC applications; (3) although loop structures are the main source of speculative parallelism, procedure structures are an important supplement. In particular, in applications whose 'hot' iteration body contains multiple procedure calls, procedure-level speculation can achieve a higher speedup than loop-level speculation.
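The effect of limited parallelism coverage in finding (2) can be illustrated with a simple Amdahl-style bound. This is only a sketch under stated assumptions: the abstract does not describe the paper's actual analysis method, and the function name `speedup_bound`, its parameters, and the coverage values used below are hypothetical and chosen for illustration. Here `coverage` is taken to mean the fraction of sequential execution time spent in speculated regions, and `region_speedup` is the speedup achieved inside those regions.

```python
def speedup_bound(coverage: float, region_speedup: float) -> float:
    """Amdahl-style overall speedup when only part of a program is speculated.

    coverage       -- assumed fraction of sequential time inside speculated
                      regions (0.0 to 1.0); hypothetical parameter
    region_speedup -- assumed speedup achieved inside those regions (> 0)
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if region_speedup <= 0.0:
        raise ValueError("region_speedup must be positive")
    # Time outside speculated regions is unchanged; time inside shrinks.
    return 1.0 / ((1.0 - coverage) + coverage / region_speedup)


if __name__ == "__main__":
    # Illustrative coverage values only; they are not taken from the paper.
    for cov in (0.50, 0.90, 0.99):
        print(f"coverage={cov:.2f} -> overall speedup {speedup_bound(cov, 100.0):.2f}x")
```

Under this simplified model, a region that speeds up 100× yields less than a 2× overall gain when it covers only half of the sequential execution time, which is one way to see why low coverage can dominate the result regardless of how well the speculated regions themselves scale.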