Abstract

Current processor design shows a clear trend towards combining several thread-level parallelism paradigms on the same chip, as exemplified by processors such as the IBM POWER7. In those processors, the way threads are assigned to hardware contexts, known as thread placement, plays a key role in overall performance. In this paper we analyze the thread placement problem on the IBM POWER7 processor. For each thread placement setup, we analyze in detail how hardware resources are shared among running threads. We show to what extent a software designer can characterize an application on the POWER7 and, based on that characterization, select the best thread placement configuration to improve a target metric. Our results show that execution time can be reduced by up to 54% (11.2% on average) when running pairs of desktop parallel applications under the appropriate thread placement.

