Abstract

Intel's Transactional Synchronization Extensions (TSX) provide hardware Transactional Memory (TM) support on 4th-generation Intel Core processors. TSX exposes two programming interfaces for software development: Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM). HLE is easy to use and remains backward compatible with processors that lack TSX support, while RTM is more flexible and scalable. Previous research has shown that critical sections protected by RTM, with a well-designed retry mechanism as the fallback code path, can often outperform HLE. More parallel programs are likely to be written with HLE; using RTM, however, may yield higher performance. To combine the productivity of HLE with the performance of RTM, we present a framework built on QEMU that dynamically transforms the HLE instructions in an application binary into RTM code fragments and tunes them adaptively on the fly. Compared with HLE execution, our prototype achieves an average speedup of 1.56x with 8 threads. Because RTM scales better, the speedup is expected to grow as the number of threads increases.
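
To illustrate what "RTM with a retry mechanism as its fallback code path" can look like, the following is a minimal sketch in C, not the framework's actual generated code. To keep it runnable on machines without TSX, the transactional attempt is abstracted behind a hypothetical callback (`tx_attempt_fn`); on real hardware that callback would wrap the `_xbegin()`/`_xend()` intrinsics from `<immintrin.h>`. The `retry_policy` structure and its simple raise-on-commit, lower-on-fallback adjustment are assumptions made for illustration only; the abstract does not describe how the adaptive tuning works. `demo_ctx`, `demo_attempt`, and `demo_fallback` are stand-in stubs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one transactional attempt. */
typedef enum { TX_COMMITTED, TX_ABORT_RETRY, TX_ABORT_NO_RETRY } tx_status;

typedef tx_status (*tx_attempt_fn)(void *ctx); /* run body as a transaction */
typedef void (*locked_fn)(void *ctx);          /* run body under the real lock */

/* Assumed tuning state: a retry budget adjusted at run time. */
typedef struct {
    int max_retries; /* current budget of transactional attempts */
    int min_retries; /* lower bound for the budget */
    int max_limit;   /* upper bound for the budget */
} retry_policy;

/* Try the critical section transactionally up to the current budget, then
 * fall back to the lock. Returns true if a transaction committed, false if
 * the fallback path ran. */
bool run_critical_section(retry_policy *p, tx_attempt_fn attempt,
                          locked_fn fallback, void *ctx)
{
    assert(p != NULL && attempt != NULL && fallback != NULL);

    for (int i = 0; i < p->max_retries; i++) {
        tx_status s = attempt(ctx);
        if (s == TX_COMMITTED) {
            /* Transactions are succeeding: allow more attempts next time. */
            if (p->max_retries < p->max_limit)
                p->max_retries++;
            return true;
        }
        if (s == TX_ABORT_NO_RETRY)
            break; /* e.g. a capacity abort: retrying is unlikely to help */
    }

    /* Budget exhausted or hopeless abort: shrink the budget, take the lock. */
    if (p->max_retries > p->min_retries)
        p->max_retries--;
    fallback(ctx);
    return false;
}

/* Stand-in stubs so the sketch runs without TSX hardware. */
typedef struct {
    int aborts_left; /* number of simulated aborts before a commit */
    int counter;     /* shared state updated by the critical section */
    int fallbacks;   /* how many times the lock path was taken */
} demo_ctx;

tx_status demo_attempt(void *ctx)
{
    demo_ctx *c = ctx;
    if (c->aborts_left > 0) {
        c->aborts_left--;
        return TX_ABORT_RETRY;
    }
    c->counter++;
    return TX_COMMITTED;
}

void demo_fallback(void *ctx)
{
    demo_ctx *c = ctx;
    c->counter++;
    c->fallbacks++;
}
```

With HLE, by contrast, the hardware makes a single elided attempt and re-executes the lock acquisition non-transactionally after an abort, so a software-controlled retry budget like the one above is available only on the RTM path.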
