• Corpus ID: 243832781

Safe and Practical GPU Acceleration in TrustZone

Heejin Park, Felix Xiaozhu Lin
We present a holistic design for GPU-accelerated computation in a TrustZone TEE. Without pulling the complex GPU software stack into the TEE, we follow a simple approach: record the CPU/GPU interactions ahead of time, and replay the interactions in the TEE at run time. This paper addresses the approach's key missing piece: the recording environment, which needs both strong security and access to diverse mobile GPUs. To this end, we present a novel architecture called CODY, in which a mobile…
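To make the record-and-replay idea concrete, here is a minimal, hypothetical sketch: GPU register accesses captured ahead of time are replayed verbatim at run time, so the replayer needs no GPU driver stack. All names (`FakeGpu`, `LOG`, `replay`, the register offsets) are illustrative assumptions, not taken from the paper.

```python
class FakeGpu:
    """Stands in for memory-mapped GPU registers."""
    def __init__(self):
        self.regs = {}

    def write(self, offset, value):
        self.regs[offset] = value

    def read(self, offset):
        # Pretend a 'status' register (0x20) reports completion once the
        # 'start' register (0x10) has been written.
        if offset == 0x20:
            return 1 if self.regs.get(0x10) == 1 else 0
        return self.regs.get(offset, 0)

# A recorded log of (op, register offset, value). A real recorder would
# capture these by interposing on the driver's MMIO accesses.
LOG = [
    ("write", 0x00, 0xDEAD),   # e.g. a job descriptor address
    ("write", 0x10, 1),        # kick off the job
    ("poll",  0x20, 1),        # wait until status matches the recording
]

def replay(gpu, log):
    """Replay recorded interactions; flag divergence from the recording."""
    for op, off, val in log:
        if op == "write":
            gpu.write(off, val)
        elif op == "poll":
            if gpu.read(off) != val:
                raise RuntimeError(f"replay diverged at reg {off:#x}")
    return True

print(replay(FakeGpu(), LOG))  # True: replay matched the recording
```

A real system must also handle the divergence case the `poll` branch raises on, which is one of the challenges the paper's approach addresses.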


SoK: Machine Learning with Confidential Computing

This SoK systematizes the key challenges and limitations of existing Trusted Execution Environment (TEE) systems for ML use cases, and discusses prospective directions, including grounded privacy definitions, partitioned ML execution, dedicated TEE designs for ML, TEE-aware ML, and full ML-pipeline guarantees.

Heterogeneous Isolated Execution for Commodity GPUs

This work implements the proposed HIX architecture on an emulated machine with KVM and QEMU, and shows that the performance overhead for security is curtailed to 26% on average for the Rodinia benchmark, while providing secure isolated GPU computing.

Graviton: Trusted Execution Environments on GPUs

Graviton enables applications to offload security- and performance-sensitive kernels and data to a GPU, and execute kernels in isolation from other code running on the GPU and all software on the host, including the device driver, the operating system, and the hypervisor.

Telekine: Secure Computing with Cloud GPUs

Telekine enables applications to use GPU acceleration in the cloud securely, based on a novel GPU stream abstraction that ensures execution and interaction through untrusted components are independent of any secret data.
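The stream abstraction above rests on making the command stream data-oblivious. A minimal sketch of that idea (illustrative only; `oblivious_schedule` and `NOOP` are not Telekine's actual API): commands are emitted into a fixed number of slots, padded with no-ops, so traffic shape never depends on secret data.

```python
from collections import deque

NOOP = "NOOP"

def oblivious_schedule(pending, slots):
    """Fill a fixed number of transmission slots from a pending queue,
    padding with no-ops; the emitted sequence length never varies."""
    out = []
    q = deque(pending)
    for _ in range(slots):
        out.append(q.popleft() if q else NOOP)
    return out

# Two secret-dependent workloads produce identically shaped traffic:
print(oblivious_schedule(["K1", "K2"], 4))  # ['K1', 'K2', 'NOOP', 'NOOP']
print(oblivious_schedule(["K1"], 4))        # ['K1', 'NOOP', 'NOOP', 'NOOP']
```

The observer sees four slots either way; only the trusted endpoints can tell real kernels from padding.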

TinyStack: A Minimal GPU Stack for Client ML

TinyStack is a novel way for deploying GPU-accelerated computation on mobile and embedded devices that addresses challenges in capturing key CPU/GPU interactions and GPU states, working around proprietary GPU internals, and preventing replay divergence.

NoMali: Simulating a realistic graphics driver stack using a stub GPU

  • R. de Jong, Andreas Sandberg
  • Computer Science
    2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
  • 2016
This paper uses gem5 to quantify the effects of software rendering on a set of common mobile workloads and introduces the NoMali stub GPU model that can be used as a drop-in replacement for a real Mali GPU model.

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware

Slalom is a framework that securely delegates the execution of all linear layers in a DNN from a TEE to a faster, yet untrusted, co-located processor, enabling high-performance execution of deep neural networks in TEEs.
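Delegating linear layers to untrusted hardware requires cheap verification; a Freivalds-style randomized check of the kind Slalom builds on can confirm an outsourced product Y = X·W in roughly O(n²) instead of recomputing it in O(n³). The sketch below is a toy pure-Python version under that assumption; the helper names are illustrative.

```python
import random

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def freivalds_check(X, W, Y, trials=16):
    """Return True if Y equals X @ W with overwhelming probability:
    check X @ (W @ r) == Y @ r for random 0/1 vectors r."""
    n = len(W[0])
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(X, matvec(W, r)) != matvec(Y, r):
            return False  # caught a cheating co-processor
    return True

X = [[1, 2], [3, 4]]
W = [[5, 6], [7, 8]]
Y = matmul(X, W)                  # honest result from the co-processor
print(freivalds_check(X, W, Y))   # True
bad = [[0, 0], [0, 0]]
print(freivalds_check(X, W, bad)) # False, except with negligible probability
```

Each trial that passes halves-or-better the chance a wrong Y slips through (error ≤ 2^-trials per standard analysis), which is why the TEE can afford to verify every delegated layer.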

Panoply: Low-TCB Linux Applications With SGX Enclaves

A new system called PANOPLY bridges the gap between SGX-native abstractions and the standard OS abstractions that feature-rich, commodity Linux applications require; it enables much stronger security in 4 real-world applications, including Tor, OpenSSL, and web services, which can base their security on a hardware root of trust.

COMET: Code Offload by Migrating Execution Transparently

The COMET (Code Offload by Migrating Execution Transparently) prototype, a realization of this design built on top of the Dalvik Virtual Machine, leverages the runtime's underlying memory model to implement distributed shared memory (DSM) with as few interactions between machines as possible.

StreamBox-TZ: Secure Stream Analytics at the Edge with TrustZone

StreamBox-TZ (SBT) is a stream analytics engine for edge platforms that offers strong data security, verifiable results, and good performance; it is designed and optimized for a TEE based on ARM TrustZone.

CloneCloud: elastic execution between mobile device and cloud

This work presents the design and implementation of CloneCloud, a system that automatically transforms mobile applications to benefit from the cloud: unmodified mobile applications running in an application-level virtual machine seamlessly off-load part of their execution from mobile devices onto device clones operating in a computational cloud.