Abstract
Implementing concurrent data structures on architectures that provide limited synchronization primitives is a critical challenge. Typical lock-based implementations suffer from well-known problems such as poor scalability and unfairness. In this paper, we propose a client-server synchronization model for data structures with a low level of parallelism, targeting distributed shared-memory many-core systems that also support message-passing communication. Additionally, we use a programmable hardware accelerator with appropriate application interfaces to overcome the performance-flexibility dilemma. Experimental results show that the proposed approach runs 20$\times$ faster than the single-lock model, with 88$\times$ fewer idle cycles and 7$\times$ lower power consumption.
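As a rough illustration of the client-server idea, the sketch below lets a single server thread own a FIFO queue while client threads submit operations as messages instead of acquiring a lock. It is a software-only analogy written under stated assumptions: the names `QueueServer`, `call`, and `run_demo` are hypothetical, Python's `queue.Queue` stands in for the message-passing channel, and neither the hardware accelerator nor the paper's actual interfaces are modeled.

```python
import queue
import threading


class QueueServer:
    """Hypothetical server: one thread owns the FIFO; clients never touch it."""

    def __init__(self):
        # Stand-in for the message-passing channel between clients and server.
        self._requests = queue.Queue()
        # The shared data structure, accessed only by the server thread.
        self._items = []
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # Requests are handled one at a time, so no lock guards self._items.
        while True:
            op, arg, reply = self._requests.get()
            if op == "stop":
                reply.put(None)
                return
            if op == "enqueue":
                self._items.append(arg)
                reply.put(True)
            elif op == "dequeue":
                reply.put(self._items.pop(0) if self._items else None)

    def call(self, op, arg=None):
        # Client side: send a request message, then block on a private reply box.
        reply = queue.Queue(maxsize=1)
        self._requests.put((op, arg, reply))
        return reply.get()


def run_demo(num_clients=4, per_client=100):
    """Spawn clients that enqueue concurrently, then drain the queue."""
    server = QueueServer()

    def client(cid):
        for i in range(per_client):
            server.call("enqueue", (cid, i))

    threads = [threading.Thread(target=client, args=(c,)) for c in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    drained = []
    while True:
        item = server.call("dequeue")
        if item is None:  # items are tuples, so None safely signals "empty"
            break
        drained.append(item)
    server.call("stop")
    return drained
```

Because every operation is serialized by the server, the clients contend only on the request channel, which is the property the model relies on for data structures with little inherent parallelism.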