Abstract

High Bandwidth Memory (HBM) is an additional memory layer between DDR and cache, and it currently exists in the form of Multi-Channel DRAM (MCDRAM) on the Intel Knights Landing (KNL) manycore architecture. Its purpose is to increase available memory bandwidth and thereby maximize processor throughput. This work explores optimizing the label propagation community detection algorithm on the KNL, as this algorithm and its variants are widely used for community detection. The algorithm's processing pattern is also representative of a broader class of vertex-centric programs. As HBM becomes more common in new HPC systems, it is important to determine how best to exploit this memory layer for memory-starved graph and combinatorial algorithms. This work experimentally examines breaking up the algorithmic work into HBM-resident chunks, along with a parametric study of associated variations and optimizations. In general, we find that our chunking methodology does not harm solution quality and can improve time to solution for label propagation. We believe these results are likely to generalize to other vertex-centric algorithms as well.
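As a rough illustration of the idea of processing label propagation in chunks, the following minimal Python sketch updates vertex labels one chunk of vertices at a time. It is not the implementation evaluated in this work: the function name `label_propagation_chunked`, the `chunk_size` and `max_iters` parameters, and the smallest-label tie-break are all assumptions made for the example, and the staging of each chunk into HBM is only indicated by a comment, not modeled.

```python
from collections import Counter


def label_propagation_chunked(adj, chunk_size, max_iters=20):
    """Asynchronous label propagation over fixed-size vertex chunks.

    adj: dict mapping each vertex to a list of its neighbours (undirected).
    chunk_size: number of vertices per chunk (hypothetical stand-in for the
                amount of work that would fit in HBM).
    """
    labels = {v: v for v in adj}      # every vertex starts in its own community
    vertices = sorted(adj)

    for _ in range(max_iters):
        changed = False
        for start in range(0, len(vertices), chunk_size):
            # On real hardware, the data for this chunk would be staged into
            # HBM here; this sketch only groups the work.
            chunk = vertices[start:start + chunk_size]
            for v in chunk:
                if not adj[v]:
                    continue          # isolated vertex keeps its label
                counts = Counter(labels[u] for u in adj[v])
                best = max(counts.values())
                # Assumed tie-break: smallest label among the most frequent.
                new_label = min(l for l, c in counts.items() if c == best)
                if new_label != labels[v]:
                    labels[v] = new_label
                    changed = True
        if not changed:
            break                     # converged: no label changed this sweep
    return labels


# Two disjoint triangles should end up as two separate communities.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1],
         3: [4, 5], 4: [3, 5], 5: [3, 4]}
result = label_propagation_chunked(graph, chunk_size=2)
```

In this sketch the chunks are visited in vertex order, so changing `chunk_size` does not change the resulting labels; the chunk size would matter only once the chunk data is actually placed in a faster memory tier.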
