Optimization of the Load Balancing Policy for Tiled Many-Core Processors

Ye Liu, Shinpei Kato, Masato Edahiro

doi:10.1109/access.2018.2883415

Abstract

Tiled many-core processors (e.g., KNL and the TILE-Gx72 processor), in which processing cores are placed on a single chip and interconnected via mesh-based networks, differ from traditional many-core systems. Their operating system (OS) should be optimized to account for the unique characteristics of tiled many-core processors (for instance, that all cores are integrated into a single chip), because these characteristics were not taken into consideration when OSes designed for traditional multicore (many-core) systems were deployed on them. In this paper, we propose an optimized load balancing policy to improve the performance of multi-threaded applications. Letting a thread select an appropriate idle (lightly loaded) tile (processing core) from all tiles on the chip, rather than from only a portion of them, reduces the overhead triggered by the load balancing policy, the cache-miss penalty caused by scheduling and by more threads sharing the same tile, and the contention for memory controllers due to cache misses. The experimental results demonstrate that the optimized load balancing policy provides up to a 2.7× performance improvement on KNL and mitigates performance degradation to varying extents on the TILE-Gx72 processor.

Highlights

Scalability problems, in which the execution time of a multithreaded application designed to take advantage of software-level parallelism cannot be reduced as more threads need to cooperate in the parallel phase(s), remain a challenge for application programmers, library designers, and operating system (OS) designers.
Since tiles, remote memory controllers, and local memory controllers are integrated onto the same chip, and non-uniform memory access latency does not dominate program performance on tiled many-core processors, a blocked thread can be awakened on any idle tile on the chip, instead of only within its previous scheduling domain, which covers a portion of the tiles.

Summary

INTRODUCTION

Scalability problems, in which the execution time of a multithreaded application designed to take advantage of software-level parallelism (and hardware-level parallelism) cannot be reduced as more threads (processing cores) need to cooperate in the parallel phase(s), remain a challenge for application programmers, library (e.g., heap manager) designers, and operating system (OS) designers. We explain that the performance of multi-threaded shared-memory applications designed for chip multiprocessors can be improved when the load balancing policy in the Linux kernel is optimized for tiled many-core processors. Since tiles (processing cores), remote memory controllers, and local memory controllers are integrated onto the same chip, and non-uniform memory access latency does not dominate program performance on tiled many-core processors, a blocked thread can be awakened on any idle (or lightly loaded) tile on the chip, instead of only within its previous scheduling domain, which covers a portion of the tiles. This observation underlies the optimized load balancing policy for the Linux kernel proposed in this paper.
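
The difference between waking a thread inside its previous scheduling domain and waking it on any tile of the chip can be illustrated with a small sketch. This is not the kernel code from the paper: the function names, the use of run-queue length as the load metric, and the grouping of tiles into contiguous fixed-size domains are all assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def pick_tile(loads: Sequence[int], candidates: Sequence[int]) -> Optional[int]:
    """Return the least-loaded tile among candidates (ties go to the lowest id)."""
    if not candidates:
        return None
    return min(candidates, key=lambda tile: (loads[tile], tile))


def wake_in_domain(loads: Sequence[int], prev_tile: int, domain_size: int) -> Optional[int]:
    """Baseline (assumed): search only the domain that contains prev_tile."""
    start = (prev_tile // domain_size) * domain_size
    end = min(start + domain_size, len(loads))
    return pick_tile(loads, range(start, end))


def wake_chip_wide(loads: Sequence[int]) -> Optional[int]:
    """Chip-wide variant (assumed): search every tile on the chip."""
    return pick_tile(loads, range(len(loads)))


# Hypothetical run-queue lengths for 8 tiles, grouped into domains of 4 tiles.
loads = [2, 1, 3, 2, 0, 1, 0, 2]
print(wake_in_domain(loads, prev_tile=1, domain_size=4))  # best tile in domain 0..3
print(wake_chip_wide(loads))                              # best tile on the whole chip
```

With these made-up loads, the domain-restricted search can only return a tile that already has work queued, while the chip-wide search finds an idle tile in the other domain. The sketch ignores the costs the paper weighs against this choice, such as cache misses after migration.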

BACKGROUND

PERFORMANCE EVALUATION

DISCUSSION

Findings

RELATED WORK

CONCLUSION AND FUTURE WORK


Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 3	License type: CC BY 3.0


