GPUGuard

Qiumin Xu, Hoda Naghibijouybari, Shibo Wang, Murali Annavaram, Nael Abu-Ghazaleh

doi:10.1145/3330345.3330389

Abstract

Graphics processing units (GPUs) are moving towards supporting concurrent kernel execution, where multiple kernels may be co-executed on the same GPU and even on the same streaming multiprocessor (SM) core. While concurrent kernel execution improves hardware resource utilization, it opens up vulnerabilities to covert-channel and side-channel attacks. These attacks exploit information leakage across kernels that results from contention on shared resources; they have been shown to be a dangerous threat on CPUs and are starting to be demonstrated on GPUs. The unique micro-architectural features of GPUs, such as specialized cache structures and massively parallel thread support, create opportunities for GPU-specific channels to be formed. In this paper, we propose GPUGuard, a decision-tree-based detection and hierarchical defense framework that can reliably close these covert channels. Our results show that GPUGuard can detect contention with 100% sensitivity and a small (8.5%) false positive rate. The timing channels are mitigated through Tangram, a GPU-specific contention channel elimination scheme, with only 8% to 23% overhead when there is an attack and zero performance overhead when no attacks are detected. Compared to temporal partitioning, GPUGuard is 69% to 96% faster across various architectures even when active, showing that it is possible to gain substantial performance from executing concurrent kernels on a single SM while securing GPUs against these attacks.

Full Text