Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization

@article{Stcker2021DeploymentOD,
  title={Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization},
  author={Lukas St{\"a}cker and Juncong Fei and Philipp Heidenreich and Frank Bonarens and Jason Raphael Rambach and Didier Stricker and Christoph Stiller},
  journal={2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pages={1015-1022}
}
Deep neural networks have proven increasingly important for automotive scene understanding, with new algorithms offering continual improvements in detection performance. However, little emphasis has been placed on the experiences and needs of deployment in embedded environments. We therefore perform a case study of the deployment of two representative object detection networks on an edge AI platform. In particular, we consider RetinaNet for image-based 2D object detection and PointPillars for LiDAR… 
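
The abstract is truncated here, so the exact toolchain used by the authors is not shown; a common deployment path for such detectors on edge AI devices is to export the trained network to ONNX and build an optimized inference engine, for example with NVIDIA TensorRT. The sketch below is a hypothetical illustration of that path, assuming a generic traceable PyTorch model `detector` and the TensorRT 8.x Python API; it is not taken from the paper, and the input shape and file names are placeholders.

```python
# Hypothetical sketch: export a PyTorch detector to ONNX, then build a
# TensorRT engine with FP16 enabled as a runtime optimization.
import torch
import tensorrt as trt

def export_onnx(detector, onnx_path="detector.onnx"):
    detector.eval()
    dummy = torch.randn(1, 3, 512, 512)          # illustrative input shape
    torch.onnx.export(detector, dummy, onnx_path,
                      input_names=["images"], opset_version=13)

def build_engine(onnx_path="detector.onnx", engine_path="detector.plan"):
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):           # parse the exported ONNX graph
            raise RuntimeError("ONNX parsing failed")
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)        # reduced-precision optimization
    engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine)                          # serialized engine for the device
```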

Citations

YOLOv5-R: lightweight real-time detection based on improved YOLOv5

This work proposes YOLOv5-R, a lightweight real-time detection algorithm that significantly reduces network size and improves inference speed, and deploys it on the Jetson Nano embedded AI device with TensorRT acceleration.

YOLO-GD: A Deep Learning-Based Object Detection Algorithm for Empty-Dish Recycling Robots

A deep learning-based object detection algorithm is proposed for empty-dish recycling robots that automatically recycle dishes in restaurants, canteens, and similar settings, using the lightweight object detection model YOLO-GD.

3D Harmonic Loss: Towards Task-consistent and Time-friendly 3D Object Detection on Edge for Intelligent Transportation System

A 3D harmonic loss function is proposed to relieve inconsistent predictions in point cloud-based object detection caused by the large sparsity of point clouds, and the feasibility of the 3D harmonic loss is demonstrated from a mathematical optimization perspective.

Design Methodology for Deep Out-of-Distribution Detectors in Real-Time Cyber-Physical Systems

A design methodology is proposed to tune deep OOD detectors to meet the accuracy and response-time requirements of embedded applications, and it is shown that this methodology can lead to a drastic reduction in response time relative to an unoptimized OOD detector while maintaining comparable accuracy (a latency-budget check is sketched below).
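
As a minimal, hypothetical illustration of checking a response-time requirement (not taken from the cited paper), one might measure tail latency of a candidate detector configuration and reject it if the budget is exceeded; `model`, `sample`, and the 50 ms budget below are placeholders.

```python
# Hypothetical sketch: measure per-inference tail latency over repeated runs
# and compare it against a response-time budget.
import time

def latency_percentile(model, sample, runs=200, warmup=20, pct=0.99):
    for _ in range(warmup):                      # warm up caches / clocks
        model(sample)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model(sample)
        times.append(time.perf_counter() - t0)
    times.sort()
    return times[int(pct * (len(times) - 1))]    # e.g. p99 latency in seconds

# Usage (illustrative budget): reject configurations whose p99 exceeds 50 ms.
# assert latency_percentile(model, sample) <= 0.050
```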

Data-Model-Hardware Tri-Design for Energy-Efficient Video Intelligence

A data-model-hardware tri-design framework is proposed for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on high-definition (HD) video streams.

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

A data-model-hardware tri-design framework is proposed for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on high-definition (HD) video streams.

References

Showing 1-10 of 26 references

PointPillars: Fast Encoders for Object Detection From Point Clouds

Benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds, and a lean downstream network is proposed.
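
The lean downstream network works because the pillar encoder's per-pillar features are scattered back onto a birds-eye-view grid, forming a pseudo-image that a standard 2D convolutional backbone can process. The following is a minimal sketch of that scatter step only, assuming pillar features and integer grid coordinates computed elsewhere; shapes and names are illustrative, not the paper's reference implementation.

```python
# Hypothetical sketch of the pillar-scatter step in a PointPillars-style pipeline.
import torch

def scatter_pillars(pillar_features, pillar_coords, grid_h, grid_w):
    """pillar_features: (P, C) features from the pillar encoder.
    pillar_coords: (P, 2) integer (row, col) grid indices of each pillar."""
    num_pillars, channels = pillar_features.shape
    canvas = pillar_features.new_zeros(channels, grid_h * grid_w)
    flat_idx = pillar_coords[:, 0] * grid_w + pillar_coords[:, 1]
    canvas[:, flat_idx] = pillar_features.t()      # drop features into their BEV cells
    return canvas.view(channels, grid_h, grid_w)   # (C, H, W) pseudo-image
```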

Feature Pyramid Networks for Object Detection

This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
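
A minimal sketch of the feature-pyramid idea, under the usual assumptions: 1x1 lateral convolutions project backbone feature maps to a common width, a top-down pathway adds upsampled coarser levels to finer ones, and a 3x3 convolution smooths each output. The channel sizes below are illustrative (ResNet-like C3-C5), not the paper's exact configuration.

```python
# Hypothetical sketch of a feature pyramid network (FPN) head in PyTorch.
import torch
from torch import nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    def __init__(self, in_channels=(512, 1024, 2048), out_channels=256):
        super().__init__()
        self.laterals = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):                      # feats: fine -> coarse, e.g. [C3, C4, C5]
        laterals = [lat(f) for lat, f in zip(self.laterals, feats)]
        for i in range(len(laterals) - 1, 0, -1):  # top-down: add upsampled coarser level
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        return [s(p) for s, p in zip(self.smooth, laterals)]  # pyramid levels, fine -> coarse
```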

Focal Loss for Dense Object Detection

This paper addresses the extreme foreground-background class imbalance encountered when training dense detectors by reshaping the standard cross-entropy loss so that it down-weights the loss assigned to well-classified examples. The resulting Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector.
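
Concretely, the loss is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where the (1 - p_t)^gamma factor shrinks the contribution of well-classified examples. Below is a minimal binary focal loss sketch in PyTorch; the defaults gamma=2 and alpha=0.25 follow the paper, while the function name and shapes are illustrative.

```python
# Minimal sketch of the binary focal loss used in RetinaNet-style detectors.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits, targets: tensors of the same shape; targets in {0, 1}."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()      # down-weights easy examples
```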

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
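
A minimal sketch of the core building block: the stacked convolutions learn a residual F(x), and an identity shortcut adds the input back, so the block outputs F(x) + x. This is the plain (non-bottleneck) variant with illustrative sizes, not the full ResNet architecture.

```python
# Hypothetical sketch of a basic residual block with an identity shortcut.
import torch
from torch import nn

class BasicBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)                 # residual F(x) plus identity shortcut
```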

nuScenes: A Multimodal Dataset for Autonomous Driving

Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven development in computer vision tasks such as object detection, tracking, and segmentation.

Are we ready for autonomous driving? The KITTI vision benchmark suite

The autonomous driving platform is used to develop novel, challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM, and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when moved outside the laboratory to the real world.

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
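
The permutation invariance comes from applying a shared MLP to every point independently and aggregating with a symmetric max-pooling operation, so the global feature does not depend on point order. The sketch below shows only this core mechanism with illustrative layer widths; the input/feature transform networks (T-Nets) of the full architecture are omitted.

```python
# Hypothetical sketch of the PointNet core: shared per-point MLP + max-pool aggregation.
import torch
from torch import nn

class TinyPointNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.point_mlp = nn.Sequential(            # applied identically to every point
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 1024), nn.ReLU())
        self.head = nn.Linear(1024, num_classes)

    def forward(self, points):                     # points: (B, N, 3)
        per_point = self.point_mlp(points)         # (B, N, 1024)
        global_feat = per_point.max(dim=1).values  # symmetric, order-invariant aggregation
        return self.head(global_feat)
```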

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

This paper presents a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize, such as MobileNets and BERT-large.
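
As a minimal, hypothetical illustration of the arithmetic behind such workflows (not the paper's exact procedure), symmetric per-tensor int8 quantization picks a scale from the observed dynamic range during calibration, rounds values into [-127, 127], and dequantizes by multiplying back by the scale; the max-abs calibration and tensor below are placeholders.

```python
# Hypothetical sketch of symmetric per-tensor int8 quantization with max-abs calibration.
import numpy as np

def calibrate_scale(calibration_data):
    return np.abs(calibration_data).max() / 127.0     # scale mapping the range to int8

def quantize(x, scale):
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Usage: measure the quantization error introduced on a weight tensor.
weights = np.random.randn(256, 256).astype(np.float32)
scale = calibrate_scale(weights)
error = np.abs(dequantize(quantize(weights, scale), scale) - weights).max()
print(f"scale={scale:.5f}, max abs quantization error={error:.5f}")
```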