TMTLS: Combine TM with TLS to Limit the Memory Contentions and Exploit the Parallelism in the Long-Running Transactions

Zhichao Yan, Dan Feng, Yujuan Tan

doi:10.1109/nas.2011.35

Abstract

As more threads are added to execute multi-threaded applications in the many-core era, memory contention among threads poses a severe challenge to both programmability and performance. Existing studies show that Transactional Memory (TM) can solve the programmability problem and scales well on the fine-grained applications in the SPLASH-2 benchmark suite. However, further investigation of the coarse-grained applications in the STAMP benchmark suite shows that long-running transactions block parallelism among concurrent transactions and fail to deliver performance gains once the number of threads exceeds 4. To address this problem, we propose TMTLS, which combines TM with Thread-Level Speculation (TLS). TMTLS limits the number of concurrently executing transactions in response to memory contention at runtime, divides each coarse-grained transaction into several epochs, and assigns those epochs to the available threads to speculatively exploit the parallelism within the coarse-grained transaction. This proposal not only alleviates memory contention among threads but also shortens the execution time of coarse-grained transactions. Moreover, it further reduces the serialization overhead caused by conflicts among transactions. Our evaluation shows that this method achieves an average speedup of 2.27 over the baseline TM system on four high-contention, coarse-grained applications selected from the STAMP benchmark suite on a 16-core CMP.
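The two mechanisms named in the abstract, throttling concurrent transactions and splitting a long transaction into epochs, can be illustrated with a minimal software sketch. This is not the authors' implementation: the function names (`split_into_epochs`, `run_epoch`, `run_transaction`), the fixed semaphore limit, and the squaring workload are all hypothetical, and conflict detection with squash and re-execution, which a real TLS system needs, is omitted. The sketch shows only the structure: an admission limit on in-flight transactions, epochs executed on worker threads against private buffers, and results committed in program order.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def split_into_epochs(items, num_epochs):
    """Split a long transaction's work items into contiguous epochs."""
    if not items:
        return []
    size = -(-len(items) // num_epochs)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_epoch(epoch):
    """Execute one epoch into a private buffer (placeholder workload)."""
    return [x * x for x in epoch]


def run_transaction(items, admit, num_epochs, pool):
    """Run one coarse-grained transaction under an admission limit.

    `admit` is a semaphore standing in for the contention-based limit on
    concurrent transactions; here the limit is fixed, not adaptive.
    """
    with admit:
        epochs = split_into_epochs(items, num_epochs)
        futures = [pool.submit(run_epoch, e) for e in epochs]
        committed = []
        for f in futures:  # commit epoch results in program order
            committed.extend(f.result())
        return committed


if __name__ == "__main__":
    admit = threading.Semaphore(2)  # assumed limit: 2 transactions in flight
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(run_transaction(list(range(10)), admit, 4, pool))
```

Because of CPython's global interpreter lock, this sketch does not demonstrate a speedup; it only mirrors the control structure described above.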

Full Text