Abstract

Extending compilers like LLVM with NUMA-aware optimisations significantly improves runtime performance and reduces energy consumption on NUMA systems. This paper presents NUMA-BTDM, a compile-time, thread-type-dependent mapping algorithm that maps threads uniformly according to the type of each thread. The thread type is given by the NUMA-BTLP algorithm, which determines it through a static analysis of the code. The compiler first inserts architecture-dependent code into the program that detects, at runtime, the characteristics of the underlying architecture on Intel processors. The mapping is then performed at runtime, through function calls from the PThreads library, according to these characteristics and to a compile-time mapping analysis that gives the CPU affinity of each thread. NUMA-BTDM allows the application to customise, control and optimise the thread mapping, and achieves balanced data locality on NUMA systems for parallel C code that combines PThreads-based task parallelism with OpenMP-based loop parallelism.
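The runtime side of such a mapping can be illustrated with a minimal sketch. The abstract does not name the PThreads functions involved, so the example assumes Linux with glibc and uses `pthread_setaffinity_np`, a non-portable GNU extension. The helper names `online_cpus`, `cpu_for_thread` and `pin_current_thread` are hypothetical, and the round-robin placement is only a stand-in for the affinity that NUMA-BTDM would derive from thread types at compile time.

```c
#define _GNU_SOURCE   /* exposes pthread_setaffinity_np and the CPU_* macros */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: number of online logical CPUs, detected at runtime. */
long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;   /* fall back to 1 if the query fails */
}

/* Hypothetical placement rule (an assumption, not the paper's rule):
 * spread thread indices round-robin over the available CPUs. */
int cpu_for_thread(int thread_index, long ncpus)
{
    assert(thread_index >= 0 && ncpus > 0);
    return (int)(thread_index % ncpus);
}

/* Pin the calling thread to a single logical CPU.
 * Returns 0 on success, or a nonzero error number on failure. */
int pin_current_thread(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);        /* start from an empty CPU mask */
    CPU_SET(cpu, &set);    /* allow exactly one CPU */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

A thread with index `i` could call `pin_current_thread(cpu_for_thread(i, online_cpus()))` at the start of its routine and check the return value, since pinning can fail when the chosen CPU is outside the process's allowed set. Building the sketch requires linking with `-pthread`.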
