Exploring hardware work queue support for lightweight threads in MPSoCs

Rahul R Sharma, Ron Sass, Yamuna Rajasekhar

doi:10.1109/reconfig.2012.6416747

Abstract

Fine-grain thread parallelism using task-based programming models is a new trend in achieving massively parallel computation. Often, software pre-fetching and queuing mechanisms for managing these dynamic environments are inadequate and fail to keep the processor cores busy with computation. At the same time, the CPU-memory performance gap is widening, which puts a strain on the memory subsystem to keep the cores busy. We describe a hardware-based pre-fetching and queuing mechanism aimed at assisting the over-subscription of very lightweight threads per core. Experiments with a soft processor and a reconfigurable accelerator core are reported. The hardware demonstrates the ability to block on out-of-order memory transactions and alleviates the software bottleneck.

Full Text