Abstract
Memory fences are widely used to ensure the correctness of synchronization constructs on machines with relaxed consistency models. However, they are expensive and usually impose over-constrained ordering that causes unnecessary CPU stalls. In this paper, we observe that memory fences in TSO are merely intended to order synchronization variables. Based on this observation, we rethink the hardware-software interface of synchronization constructs on multicore processors and propose a new design called Sync-Order that differentiates synchronization variables (sync-vars) from normal ones. Sync-Order reduces hardware complexity such that the processor only needs to serialize the ordering among sync-vars. Its simplicity makes it easy to integrate into the directory controller, and it supports distributed directories, a feature missing from prior designs. We show that Sync-Order eliminates traditional fences on all sides of synchronization constructs (instead of only one side, as in prior work) and requires little effort from a programmer or compiler to annotate sync-vars. Our experimental results show that Sync-Order significantly reduces CPU stalls and boosts the performance of a set of synchronization constructs and concurrent data structures by 10 percent; meanwhile, the fence overhead of full applications from SPLASH-2 and PARSEC is reduced from 42 to 3 percent.
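To make the observation about TSO concrete, the following is a minimal software-level sketch of the classic store-buffering pattern that fences are normally used for. It is only an analogy and not the Sync-Order hardware interface, which the abstract does not specify: the names `flag0`, `flag1`, `store_buffering_once`, and `count_forbidden` are hypothetical, and C++ sequentially consistent atomics stand in for annotated sync-vars. On x86-TSO a compiler typically implements such a store with a fence or a locked instruction, which is the cost the abstract refers to.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical illustration: flag0 and flag1 play the role of "sync-vars".
// Only they are atomic (sequentially consistent by default); the result
// slots r0 and r1 are ordinary variables.
struct Outcome {
    int r0;
    int r1;
};

// One run of the store-buffering litmus test (the core of Dekker-style
// mutual exclusion): each thread sets its own flag, then reads the other's.
Outcome store_buffering_once() {
    std::atomic<int> flag0{0};
    std::atomic<int> flag1{0};
    int r0 = -1;
    int r1 = -1;

    std::thread t0([&] {
        flag0.store(1);      // seq_cst store to a sync-var
        r0 = flag1.load();   // seq_cst load of the other sync-var
    });
    std::thread t1([&] {
        flag1.store(1);
        r1 = flag0.load();
    });

    t0.join();               // join makes r0 and r1 safely visible here
    t1.join();
    return {r0, r1};
}

// Counts runs in which both threads read 0. With seq_cst atomics the C++
// memory model forbids that outcome, so the count is expected to stay 0.
int count_forbidden(int iterations) {
    assert(iterations >= 0);
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        Outcome o = store_buffering_once();
        if (o.r0 == 0 && o.r1 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Calling `count_forbidden(1000)` should return 0 because the ordering is enforced on the two flags alone. If the accesses were weakened to `std::memory_order_relaxed` and no fence were added, TSO hardware could let each load bypass the earlier buffered store, and both threads might read 0.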
More From: IEEE Transactions on Parallel and Distributed Systems