Abstract

Multiprocessor systems-on-chip that integrate CPUs and GPUs are suitable platforms for real-time embedded applications requiring massively parallel processing. For such applications, two major concerns are lifetime reliability, which is limited by permanent faults, and soft-error reliability, which is limited by transient faults. Detailed execution profiling has revealed that a CUDA task’s CPU execution time increases significantly if the task executes on a different core than the operating system (OS). Based on this observation, an extended task model is introduced that accounts for the execution time dependencies among tasks and the OS. A hybrid framework is proposed to improve soft-error reliability while satisfying a lifetime reliability constraint for soft real-time systems executing on integrated CPU and GPU platforms. This framework: 1) reduces the total utilization of cores and improves soft-error reliability via off-line task mapping; 2) achieves a higher lifetime reliability through task migration at run time; and 3) improves soft-error reliability by dynamically scaling the frequencies of CPU and GPU cores. The experimental results show that the proposed framework leads to a system that can execute without soft errors for at least 4 days (4 times) and 6 days (6 times) longer, on average, than existing approaches.
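The link between frequency scaling and soft-error reliability in point 3) can be illustrated with a small sketch. The abstract does not state the reliability model used, so the code below assumes an exponential fault-rate model that is common in the frequency-scaling reliability literature; the function names and the parameters `lam0`, `d`, and `f_min` are hypothetical values chosen only for illustration.

```python
import math


def fault_rate(f, lam0=1e-6, d=3.0, f_min=0.4):
    """Assumed transient-fault rate at normalized frequency f (f_min <= f <= 1).

    lam0 is the rate at maximum frequency (f = 1). The rate grows
    exponentially as the frequency (and supply voltage) is lowered.
    All parameter values here are illustrative, not from the paper.
    """
    return lam0 * 10 ** (d * (1.0 - f) / (1.0 - f_min))


def soft_error_reliability(wcet_at_fmax, f, lam0=1e-6, d=3.0, f_min=0.4):
    """Probability that one job completes without a soft error at frequency f.

    Execution time is assumed to scale as wcet_at_fmax / f, and faults
    are assumed to arrive as a Poisson process with rate fault_rate(f).
    """
    exec_time = wcet_at_fmax / f
    return math.exp(-fault_rate(f, lam0, d, f_min) * exec_time)


if __name__ == "__main__":
    # Compare reliability of the same job at several normalized frequencies.
    for f in (0.4, 0.7, 1.0):
        print(f, soft_error_reliability(10.0, f))
```

Under these assumptions, running a job at a higher frequency both shortens its exposure time and lowers the fault rate, so reliability increases with frequency. This is the kind of trade-off against lifetime reliability and utilization that a frequency-scaling policy has to balance.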
