Abstract

Threads within a warp normally do not interfere severely with one another. However, the scheduler must wait until every thread in a warp completes before it can schedule the next warp, so memory divergence among a warp's threads stalls progress. The crux of the problem is to schedule warps in a better order. We therefore propose a new warp scheduling strategy called WSMP, which is based on a multi-level feedback queue (MFQ) and perceptron-based prefetch filtering (PPF). Warps are sorted beforehand by their latency tolerance, and each is pushed into the corresponding MFQ queue. We also adapt PPF to enhance the modified underlying prefetcher, which lets us strike a balance between cache hit rate and prefetch coverage. We verify the feasibility of WSMP using GPGPU-Sim with GPGPU-specific workloads. The results show that, compared to the baseline, WSMP improves IPC by 26.45% and reduces the L2 cache miss rate by 9.54% on average.
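
The queue placement step described above can be illustrated with a minimal Python sketch. It is not the WSMP implementation: the `Warp` class, the `latency_tolerance` score in [0, 1], the threshold values, and the choice to serve less tolerant warps first are all assumptions made for illustration, and the feedback step (moving warps between levels at run time) and the PPF prefetcher are omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Warp:
    warp_id: int
    # Hypothetical score in [0, 1]; higher means the warp tolerates stalls better.
    latency_tolerance: float


class MultiLevelQueue:
    """Toy multi-level queue; level 0 is served first."""

    def __init__(self, thresholds=(0.33, 0.66)):
        # Hypothetical cut points that split the tolerance range into levels.
        self.thresholds = thresholds
        self.levels = [deque() for _ in range(len(thresholds) + 1)]

    def level_for(self, warp):
        # Assumption: less tolerant warps go to lower-numbered levels.
        for level, cut in enumerate(self.thresholds):
            if warp.latency_tolerance < cut:
                return level
        return len(self.thresholds)

    def push(self, warp):
        self.levels[self.level_for(warp)].append(warp)

    def pop_next(self):
        # Serve the first non-empty level, FIFO within a level.
        for queue in self.levels:
            if queue:
                return queue.popleft()
        return None


def schedule_order(warps):
    """Return warp ids in the order this toy scheduler would issue them."""
    mfq = MultiLevelQueue()
    # Sort beforehand by latency tolerance, then place each warp in a queue.
    for warp in sorted(warps, key=lambda w: w.latency_tolerance):
        mfq.push(warp)
    order = []
    while (warp := mfq.pop_next()) is not None:
        order.append(warp.warp_id)
    return order
```

For example, calling `schedule_order` on four warps with tolerances 0.9, 0.1, 0.5, and 0.2 (ids 0 to 3) places ids 1 and 3 in level 0, id 2 in level 1, and id 0 in level 2, so they are issued in that order.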
