Corpus ID: 243832781

Safe and Practical GPU Acceleration in TrustZone

Heejin Park and Felix Xiaozhu Lin
We present a holistic design for GPU-accelerated computation in TrustZone TEE. Without pulling the complex GPU software stack into the TEE, we follow a simple approach: record the CPU/GPU interactions ahead of time, and replay the interactions in the TEE at run time. This paper addresses the approach’s key missing piece – the recording environment, which needs both strong security and access to diverse mobile GPUs. To this end, we present a novel architecture called CODY, in which a mobile… 
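The record-and-replay idea in the abstract can be illustrated with a toy sketch. This is not CODY's implementation: `FakeGPU`, the register offsets, `Recorder`, and `replay` are all hypothetical names invented for illustration. It assumes CPU/GPU interactions can be modeled as register reads and writes, and that replay should stop if the GPU's response differs from what was recorded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interaction:
    kind: str   # "write" or "read"
    reg: int    # hypothetical register offset
    value: int  # value written, or value observed at record time


class FakeGPU:
    """Stand-in for memory-mapped GPU registers (purely illustrative)."""

    def __init__(self) -> None:
        self.regs: Dict[int, int] = {}

    def write(self, reg: int, value: int) -> None:
        self.regs[reg] = value
        if reg == 0x10 and value == 1:  # hypothetical "start job" register
            self.regs[0x14] = 1         # hypothetical "job done" status

    def read(self, reg: int) -> int:
        return self.regs.get(reg, 0)


class Recorder:
    """Forwards each access to the GPU and appends it to a log."""

    def __init__(self, gpu: FakeGPU) -> None:
        self.gpu = gpu
        self.log: List[Interaction] = []

    def write(self, reg: int, value: int) -> None:
        self.gpu.write(reg, value)
        self.log.append(Interaction("write", reg, value))

    def read(self, reg: int) -> int:
        value = self.gpu.read(reg)
        self.log.append(Interaction("read", reg, value))
        return value


def replay(log: List[Interaction], gpu: FakeGPU) -> None:
    """Re-issue recorded accesses; fail if a read differs from the recording."""
    for step in log:
        if step.kind == "write":
            gpu.write(step.reg, step.value)
        else:
            observed = gpu.read(step.reg)
            if observed != step.value:
                raise RuntimeError(f"replay diverged at register {step.reg:#x}")


if __name__ == "__main__":
    # Record phase: the full driver stack would drive the GPU here.
    recorder = Recorder(FakeGPU())
    recorder.write(0x00, 0xABCD)  # hypothetical job descriptor address
    recorder.write(0x10, 1)       # start the job
    recorder.read(0x14)           # poll completion status

    # Replay phase: only the log and a small replayer are needed.
    replay(recorder.log, FakeGPU())
```

In this sketch the replayer needs only the log, which mirrors the abstract's point that the GPU software stack itself stays outside the TEE.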

Heterogeneous Isolated Execution for Commodity GPUs
This work implements the proposed HIX architecture on an emulated machine with KVM and QEMU, and shows that the performance overhead for security is curtailed to 26% on average for the Rodinia benchmark, while providing secure isolated GPU computing.
Graviton: Trusted Execution Environments on GPUs
Graviton enables applications to offload security- and performance-sensitive kernels and data to a GPU, and execute kernels in isolation from other code running on the GPU and all software on the host, including the device driver, the operating system, and the hypervisor.
Telekine: Secure Computing with Cloud GPUs
Telekine enables applications to use GPU acceleration in the cloud securely, based on a novel GPU stream abstraction that ensures execution and interaction through untrusted components are independent of any secret data.
TinyStack: A Minimal GPU Stack for Client ML
TinyStack is a novel way of deploying GPU-accelerated computation on mobile and embedded devices that addresses challenges in capturing key CPU/GPU interactions and GPU states, working around proprietary GPU internals, and preventing replay divergence.
NoMali: Simulating a realistic graphics driver stack using a stub GPU
This paper uses gem5 to quantify the effects of software rendering on a set of common mobile workloads and introduces the NoMali stub GPU model that can be used as a drop-in replacement for a real Mali GPU model.
Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware
Slalom is a framework that securely delegates execution of all linear layers in a DNN from a TEE to a faster, yet untrusted, co-located processor, enabling high-performance execution of deep neural networks in TEEs.
Panoply: Low-TCB Linux Applications With SGX Enclaves
PANOPLY is a new system that bridges the gap between SGX-native abstractions and the standard OS abstractions that feature-rich, commodity Linux applications require. It enables much stronger security in 4 real-world applications (including Tor, OpenSSL, and web services), which can base their security on a hardware root of trust.
COMET: Code Offload by Migrating Execution Transparently
The COMET (Code Offload by Migrating Execution Transparently) prototype, built on top of the Dalvik Virtual Machine, leverages the runtime's underlying memory model to implement distributed shared memory (DSM) with as few interactions between machines as possible.
StreamBox-TZ: Secure Stream Analytics at the Edge with TrustZone
StreamBox-TZ (SBT) is a stream analytics engine for an edge platform that offers strong data security, verifiable results, and good performance. It is designed and optimized for a TEE based on ARM TrustZone.
CloneCloud: elastic execution between mobile device and cloud
This paper presents the design and implementation of CloneCloud, a system that automatically transforms mobile applications to benefit from the cloud. It enables unmodified mobile applications running in an application-level virtual machine to seamlessly offload part of their execution from mobile devices onto device clones operating in a computational cloud.