Corpus ID: 237292854

YOLOP: You Only Look Once for Panoptic Driving Perception

Dongsheng Wu, Manwen Liao, Weitian Zhang, Xinggang Wang
A panoptic driving perception system is an essential part of autonomous driving. A high-precision and real-time perception system can assist the vehicle in making reasonable decisions while driving. We present a panoptic driving perception network (YOLOP) to perform traffic object detection, drivable area segmentation and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders to handle the specific tasks. Our model performs extremely well on the…
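
The shared-encoder, three-decoder layout described above can be sketched as follows. This is a minimal structural illustration, not the paper's implementation: the function bodies are stand-ins (the real model uses a CSPDarknet backbone and learned heads), and all names here are hypothetical.

```python
# Hypothetical sketch of YOLOP's one-encoder / three-decoder layout.
# The bodies below are stand-ins, not the paper's actual layers.

def encoder(image):
    # Stand-in for the shared backbone + neck: returns one feature map
    # that every task head reuses.
    return [[pixel * 0.5 for pixel in row] for row in image]

def detect_head(features):
    # Stand-in for the traffic-object detection decoder.
    return {"boxes": [], "scores": []}

def drivable_head(features):
    # Stand-in for the drivable-area segmentation decoder.
    return [[1 if v > 0.25 else 0 for v in row] for row in features]

def lane_head(features):
    # Stand-in for the lane-line segmentation decoder.
    return [[1 if v > 0.4 else 0 for v in row] for row in features]

def yolop_forward(image):
    shared = encoder(image)           # encoder runs once per image
    return (detect_head(shared),      # all three decoders share it
            drivable_head(shared),
            lane_head(shared))
```

The point of the design is that the costly feature extraction is paid once, so adding a task costs only one lightweight head.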

YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception

This paper proposes an effective and efficient multi-task learning network to simultaneously perform traffic object detection, drivable road area segmentation and lane detection, achieving new state-of-the-art (SOTA) performance in terms of accuracy and speed on the challenging BDD100K dataset.

Joint Semantic Understanding with a Multilevel Branch for Driving Perception

This study proposes a multi-task learning framework for simultaneous traffic object detection, drivable area segmentation, and lane line segmentation in an efficient way and demonstrates the effectiveness of this framework on the BerkeleyDeepDrive (BDD100K) dataset.

Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving

A simple yet effective pretrain-adapt-finetune paradigm for general multi-task training, where the off-the-shelf pretrained models can be effectively adapted without increasing the training overhead is proposed.

HybridNets: End-to-End Perception Network

An end-to-end perception network, called HybridNets, performs multiple tasks simultaneously, including traffic object detection, drivable area segmentation and lane detection; it achieves better accuracy than prior art and can perform visual perception tasks in real time, making it a practical and accurate solution to the multi-tasking problem.

Robust Lane Detection via Filter Estimator and Data Augmentation

This paper uses a network architecture composed of an Encoder-Decoder with a Feature Shift Aggregator between them to make the prediction more comprehensive, and achieves SOTA accuracy on the TuSimple dataset.

Efficient Perception, Planning, and Control Algorithms for Vision-Based Automated Vehicles

The proposed CILQR controllers are shown to be more efficient than the sequential quadratic programming (SQP) methods and can collaborate with the MTUNet to drive a car autonomously in unseen simulation environments for lane-keeping and car-following maneuvers.

A Multi-Task Model for Sea-Sky Scene Perception with Information Intersection

A Multi-Task Model for Sea-Sky Scene Perception that completes both sub-tasks through one end-to-end inference and achieves promising performance in both latency and precision.

A Quality Index Metric and Method for Online Self-Assessment of Autonomous Vehicles Sensory Perception

Perception is critical to autonomous driving safety. Camera-based object detection is one of the most important methods for autonomous vehicle perception. Current camera-based object detection…

Improving Road Segmentation in Challenging Domains using Similar Place Priors

This research uses Visual Place Recognition to find similar but geographically distinct places, and fuses segmentations of query images with these similar-place priors using a Bayesian approach and a novel segmentation quality metric to improve road segmentation.
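
A per-pixel Bayesian fusion of two independent binary road-probability estimates can be sketched as below. This is a generic sketch under an independence assumption, not the paper's exact scheme; the function name is hypothetical.

```python
def fuse_binary(p_query, p_prior):
    # Posterior probability that a pixel is road, combining the query
    # segmentation and the similar-place prior as independent evidence:
    #   P(road | q, p) ∝ p_query * p_prior
    num = p_query * p_prior
    den = num + (1.0 - p_query) * (1.0 - p_prior)
    return num / den
```

Note the behavior at the extremes: a prior of 0.5 is uninformative (the query probability passes through unchanged), while two weak agreements reinforce each other, e.g. fusing 0.7 with 0.7 yields about 0.84.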

An Improved Tiered Head Pose Estimation Network with Self-Adjust Loss Function

A THESL-Net (tiered head pose estimation with self-adjustment loss network) model is proposed, which gains greater freedom during angle estimation and outperforms state-of-the-art approaches.

DLT-Net: Joint Detection of Drivable Areas, Lane Lines, and Traffic Objects

This work proposes a unified neural network named DLT-Net to detect drivable areas, lane lines, and traffic objects simultaneously, and constructs context tensors between sub-task decoders to share information among tasks.

Towards End-to-End Lane Detection: an Instance Segmentation Approach

A fast lane detection algorithm, running at 50 fps, is proposed, which can handle a variable number of lanes and cope with lane changes, and is robust against road plane changes, unlike existing approaches that rely on a fixed, predefined transformation.

Learning Lightweight Lane Detection CNNs by Self Attention Distillation

It is observed that attention maps extracted from a model trained to a reasonable level would encode rich contextual information that can be used as a form of ‘free’ supervision for further representation learning through performing top-down and layer-wise attention distillation within the network itself.
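
The layer-wise distillation idea can be sketched as follows: collapse a layer's activations into a single spatial attention map, then penalize the distance between a shallow layer's map and a deeper layer's map, which serves as the "free" target. This is a simplified sketch (normalization and the exact distance used in the paper are omitted); the function names are hypothetical.

```python
def attention_map(activations):
    # activations: list of channel maps, each a 2D list of floats.
    # Collapse channels by summing squared values, as in
    # activation-based attention transfer.
    h, w = len(activations[0]), len(activations[0][0])
    amap = [[0.0] * w for _ in range(h)]
    for ch in activations:
        for i in range(h):
            for j in range(w):
                amap[i][j] += ch[i][j] ** 2
    return amap

def sad_loss(shallow_acts, deep_acts):
    # Mean-squared error between the shallow layer's attention map
    # and the deeper layer's map (treated as a fixed target).
    a, b = attention_map(shallow_acts), attention_map(deep_acts)
    n = len(a) * len(a[0])
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0]))) / n
```

Because the target comes from the network itself, no extra labels or teacher model are needed, which is what makes the supervision "free".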

Ultra Fast Structure-aware Deep Lane Detection

A novel, simple, yet effective formulation aiming at extremely fast speed and challenging scenarios, which treats the process of lane detection as a row-based selecting problem using global features and proposes a structural loss to explicitly model the structure of lanes.
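The row-based formulation replaces dense per-pixel segmentation with a cheap per-row classification: each image row gets scores over a set of gridded column cells plus one extra "no lane in this row" class. A minimal decoding sketch (hypothetical names, not the authors' code):

```python
def rows_to_lane_points(row_logits, col_centers):
    # row_logits[r]: scores for each column cell, with one extra
    # trailing score meaning "no lane crosses this row".
    # col_centers: x coordinate of each column cell's center.
    points = []
    for r, logits in enumerate(row_logits):
        best = max(range(len(logits)), key=lambda k: logits[k])
        if best < len(col_centers):       # last index = background class
            points.append((r, col_centers[best]))
    return points
```

The speed comes from the output size: for R rows and C column cells the head predicts R × (C + 1) values per lane instead of a full-resolution mask.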

YOLO9000: Better, Faster, Stronger

YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories, is introduced, along with a method to jointly train on object detection and classification, using improvements both novel and drawn from prior work.

You Only Look Once: Unified, Real-Time Object Detection

Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Scaled-YOLOv4: Scaling Cross Stage Partial Network

We show that the YOLOv4 object detection neural network based on the CSP approach, scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy.

YOLOv4: Optimal Speed and Accuracy of Object Detection

This work uses new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss, and combines some of them to achieve state-of-the-art results: 43.5% AP on the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100.
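
Of the features listed, the CIoU loss is simple enough to sketch directly from its published formula: it augments 1 − IoU with a normalized center-distance term and an aspect-ratio consistency term. The code below is a straightforward transcription for axis-aligned boxes, not the authors' implementation.

```python
import math

def ciou_loss(box, gt):
    # box, gt: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    x1, y1, x2, y2 = box
    g1, h1, g2, h2 = gt
    # Intersection over union.
    iw = max(0.0, min(x2, g2) - max(x1, g1))
    ih = max(0.0, min(y2, h2) - max(y1, h1))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (g2 - g1) * (h2 - h1) - inter
    iou = inter / union
    # Squared center distance over squared enclosing-box diagonal.
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    gx, gy = (g1 + g2) / 2, (h1 + h2) / 2
    cw = max(x2, g2) - min(x1, g1)
    ch = max(y2, h2) - min(y1, h1)
    rho2 = (cx - gx) ** 2 + (cy - gy) ** 2
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan((g2 - g1) / (h2 - h1))
                              - math.atan((x2 - x1) / (y2 - y1))) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike plain 1 − IoU, this loss still gives a useful gradient when the predicted box and the ground truth barely overlap, because the center-distance term keeps pulling them together.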

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%.

Location-Sensitive Visual Recognition with Cross-IOU Loss

Evaluated on the MS-COCO dataset, LSNet sets the new state-of-the-art accuracy for anchor-free object detection and instance segmentation, and shows promising performance in detecting multi-scale human poses.