The Case for Adaptive Deep Neural Networks in Edge Computing

Francis McNamee, Schahram Dustdar, Peter Kilpatrick, Weisong Shi, Ivor T. A. Spence, and Blesson Varghese. 2021 IEEE 14th International Conference on Cloud Computing (CLOUD).
Deep Neural Networks (DNNs) are an application class that benefits from being distributed across the edge and the cloud. A DNN is partitioned so that specific layers are deployed onto the edge and the cloud to meet performance and privacy objectives. However, there is limited understanding of whether and how evolving operational conditions (increased CPU and memory utilization at the edge, or reduced data transfer rates between the edge and cloud) affect the performance of already deployed DNNs.
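The layer-wise edge/cloud split described above can be illustrated with a minimal sketch. The layer functions and the fixed split point below are invented for illustration and stand in for real DNN layers; this is not the paper's implementation:

```python
# Minimal sketch of layer-wise DNN partitioning (illustrative only:
# the layers and the split point k are assumptions, not the paper's method).

def relu(x):
    return [max(0.0, v) for v in x]

def scale(factor):
    return lambda x: [factor * v for v in x]

# A "DNN" as an ordered list of layer functions.
layers = [scale(2.0), relu, scale(0.5), relu]

def run(stage, x):
    for layer in stage:
        x = layer(x)
    return x

# Partition at layer index k: the first k layers run on the edge,
# the remainder run in the cloud; only the intermediate activation
# crosses the network.
k = 2
edge_stage, cloud_stage = layers[:k], layers[k:]

x = [-1.0, 0.5, 3.0]
activation = run(edge_stage, x)        # computed at the edge
output = run(cloud_stage, activation)  # computed in the cloud

assert output == run(layers, x)  # same result as the unpartitioned DNN
```

Only `activation` needs to be transmitted, which is why the choice of split point trades off edge compute against transfer cost.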

CONTINUER: Maintaining Distributed DNN Services During Edge Failures

CONTINUER is a framework that, when an edge node fails, estimates the accuracy and latency of the available techniques for maintaining distributed DNN services and selects the best technique for user-defined objectives.

NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs

This paper presents the NEUKONFIG framework that identifies the service downtime incurred when repartitioning DNNs and proposes approaches for reducing edge service downtime, based on ‘Dynamic Switching’ in which a new edge-cloud pipeline is initialised with new DNN partitions.



Convergence of Edge Computing and Deep Learning: A Comprehensive Survey

By consolidating information scattered across the communication, networking, and DL areas, this survey can help readers to understand the connections between enabling technologies while promoting further discussions on the fusion of edge intelligence and intelligent edge, i.e., Edge DL.

DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters

DeepThings is a framework for adaptively distributed execution of CNN-based inference applications on tightly resource-constrained IoT edge clusters; it employs a scalable Fused Tile Partitioning of convolutional layers to minimize memory footprint while exposing parallelism.
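The key idea behind Fused Tile Partitioning, overlapping input tiles so that a stack of convolutional layers can run independently per tile, can be sketched in one dimension. The kernels and tile size below are arbitrary assumptions for illustration, not the DeepThings implementation:

```python
# 1D sketch of Fused Tile Partitioning: each tile carries enough
# overlap ("halo") that two stacked valid convolutions can be fused
# and computed independently per tile.

def conv1d(x, k):
    """Valid 1D convolution (cross-correlation) of x with kernel k."""
    return [sum(x[i + j] * k[j] for j in range(len(k)))
            for i in range(len(x) - len(k) + 1)]

K1 = [1.0, 0.0, -1.0]   # first conv layer's kernel (assumed)
K2 = [0.5, 0.5, 0.5]    # second conv layer's kernel (assumed)
HALO = (len(K1) - 1) + (len(K2) - 1)  # extra input needed per tile

def fused(x):
    """Two convolution layers fused into one pass over a tile."""
    return conv1d(conv1d(x, K1), K2)

x = [float(v) for v in range(12)]
full = fused(x)  # reference: run both layers on the whole input

# Partition the output range into tiles; each worker receives its
# input slice plus the halo and runs the fused layers locally.
tile = 4
parts = []
for start in range(0, len(full), tile):
    end = min(start + tile, len(full))
    parts.extend(fused(x[start:end + HALO]))

assert parts == full  # tiled execution matches the monolithic result
```

Because each tile only ever holds its slice plus the halo, no worker materializes the full intermediate feature map, which is the memory saving the abstract refers to.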

Privacy-Aware Edge Computing Based on Adaptive DNN Partitioning

This paper designs an offloading strategy that adaptively partitions the DNN in varying network environments to make the optimal tradeoff between performance and privacy for battery-powered mobile devices.

MoDNN: Local distributed mobile computing system for Deep Neural Network

MoDNN is a local distributed mobile computing system for DNN applications that partitions already-trained DNN models onto several mobile devices, accelerating DNN computation by alleviating device-level computing cost and memory usage.

Distributed Inference Acceleration with Adaptive DNN Partitioning and Offloading

Results obtained using a self-driving car dataset and several DNN benchmarks show that the proposed solution significantly reduces total latency for DNN inference compared to other distributed approaches and is 2.6 to 4.2 times faster than the state of the art.

Dynamic Adaptive DNN Surgery for Inference Acceleration on the Edge

DNN surgery allows a partitioned DNN to be processed at both the edge and the cloud while limiting data transmission; the Dynamic Adaptive DNN Surgery (DADS) scheme optimally partitions the DNN under different network conditions.

Edge Intelligence: Paving the Last Mile of Artificial Intelligence With Edge Computing

A comprehensive survey of recent research efforts on EI is conducted, providing an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training and inference at the network edge.

Distributing Deep Neural Networks with Containerized Partitions at the Edge

A containerized, partition-based, runtime-adaptive convolutional neural network (CNN) acceleration framework for Internet of Things (IoT) environments leverages spatial partitioning through convolution-layer fusion to dynamically select the optimal partition according to the availability of computational resources and network conditions.

ADDA: Adaptive Distributed DNN Inference Acceleration in Edge Computing Environment

An adaptive distributed DNN inference acceleration framework for edge computing environments is proposed, in which both DNN computation path optimization and DNN computation partition optimization are taken into consideration; experiments demonstrate that the method effectively accelerates DNN inference compared to state-of-the-art methods.

Adaptive deep learning model selection on embedded systems

This paper presents an adaptive scheme to determine which DNN model to use for a given input, by considering the desired accuracy and inference time, and considers a range of influential DNN models.
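The paper builds a predictor that chooses a model per input; the simpler selection step it rests on, picking the most accurate model whose profiled inference time fits a budget, can be sketched as a lookup over model profiles. The model names and numbers below are hypothetical:

```python
# Hypothetical model profiles: (name, top-1 accuracy, latency in ms).
# These numbers are assumptions for illustration, not measurements.
MODELS = [
    ("mobilenet", 0.70,  30.0),
    ("resnet50",  0.76, 120.0),
    ("resnet152", 0.78, 400.0),
]

def select_model(latency_budget_ms):
    """Most accurate model that fits the latency budget, else the fastest."""
    feasible = [m for m in MODELS if m[2] <= latency_budget_ms]
    if not feasible:
        return min(MODELS, key=lambda m: m[2])[0]
    return max(feasible, key=lambda m: m[1])[0]
```

For example, `select_model(150.0)` returns `"resnet50"`: it is the most accurate profile that stays within the 150 ms budget.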