BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

@article{Yu2020BDD100KAD,
  title={BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
  author={Fisher Yu and Haofeng Chen and Xin Wang and Wenqi Xian and Yingying Chen and Fangchen Liu and Vashisht Madhavan and Trevor Darrell},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={2633-2642}
}
  • Published 12 May 2018 (arXiv preprint); presented at CVPR 2020
Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving. Researchers are usually constrained to study a small set of problems on one dataset, while real-world computer vision applications require performing tasks of various complexities. We construct BDD100K, the largest driving video dataset with 100K videos and 10 tasks to evaluate the exciting progress of image recognition…
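Since BDD100K distributes its annotations as JSON, a minimal sketch of reading the 2D-detection labels may make the task structure concrete. It assumes the commonly documented per-image schema (name, attributes, and a labels list with category and box2d fields) and a placeholder file name; both are assumptions, not the official toolkit API.

import json

# Minimal sketch: iterate over BDD100K-style 2D detection labels.
# Field names and the file name are assumptions based on the commonly
# documented BDD100K label format, not the official toolkit.
with open("bdd100k_labels_images_val.json") as f:
    frames = json.load(f)

for frame in frames[:3]:
    print(frame["name"], frame["attributes"].get("weather"))
    for label in frame.get("labels", []):
        box = label.get("box2d")
        if box is None:  # e.g., lane or drivable-area labels carry no 2D box
            continue
        print(f'  {label["category"]}: '
              f'({box["x1"]:.0f}, {box["y1"]:.0f}) - ({box["x2"]:.0f}, {box["y2"]:.0f})')

If the official bdd100k toolkit is available, prefer its loaders; the sketch above only illustrates the label layout.
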
Citations

D2-City: A Large-Scale Dashcam Video Dataset of Diverse Traffic Scenarios
TLDR: Proposes D²-City, a large-scale, comprehensive collection of dashcam videos collected by vehicles on DiDi's platform, containing more than 10,000 video clips that reflect the diversity and complexity of real-world traffic scenarios in China.
DeepBbox: Accelerating Precise Ground Truth Generation for Autonomous Driving Datasets
TLDR: Proposes DeepBbox, an algorithm that "corrects" loose object labels into accurate bounding boxes, increasing the number of automatically labeled object edges by 50% and reducing manual annotation time.
A*3D Dataset: Towards Autonomous Driving in Challenging Environments
TLDR: Introduces the new, challenging A*3D dataset, consisting of RGB images and LiDAR data with significant diversity in scene, time, and weather; it addresses gaps in existing datasets and pushes autonomous-driving research toward more challenging, highly diverse environments.
Cross-Domain Car Detection Using Unsupervised Image-to-Image Translation: From Day to Night
TLDR: Explores a Generative Adversarial Network (GAN)-based model for generating an artificial dataset, with corresponding annotations, that is used to train a car detector on annotated data from a source domain without requiring image annotations from the target domain.
Temporal Coherence for Active Learning in Videos
TLDR: Introduces a novel active learning approach for object detection in videos that exploits temporal coherence, minimizing an energy function defined on a graphical model that provides estimates of both false positives and false negatives.
The UAVid Dataset for Video Semantic Segmentation
TLDR: Introduces UAVid, a new high-resolution UAV video semantic segmentation dataset consisting of 30 video sequences, and provides several deep learning baseline methods, among which the proposed Multi-Scale-Dilation net performs best via multi-scale feature extraction.
WoodScape: A Multi-Task, Multi-Camera Fisheye Dataset for Autonomous Driving
TLDR: Releases WoodScape, the first extensive fisheye automotive dataset, named after Robert Wood; it comprises four surround-view cameras and nine tasks, including segmentation, depth estimation, 3D bounding box detection, and soiling detection.
TJU-DHD: A Diverse High-Resolution Dataset for Object Detection
TLDR: Builds a diverse, high-resolution dataset for object detection and pedestrian detection in self-driving vehicles and video surveillance, with rich diversity in season, illumination, and weather.
Deep traffic light detection by overlaying synthetic context on arbitrary natural images
TLDR: Proposes a method for generating artificial traffic-related training data for deep traffic light detectors, using basic, non-realistic computer graphics to blend fake traffic scenes on top of arbitrary image backgrounds unrelated to the traffic domain.
nuScenes: A Multimodal Dataset for Autonomous Driving
Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image-based benchmark datasets have driven development in computer vision tasks such as object…

References

Showing 1–10 of 46 references.
The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes
TLDR: Presents the Mapillary Vistas Dataset, a novel, large-scale street-level image dataset containing 25,000 high-resolution images annotated into 66 object categories, with additional instance-specific labels for 37 classes, aiming to significantly further the development of state-of-the-art methods for visual road-scene understanding.
The Cityscapes Dataset for Semantic Urban Scene Understanding
TLDR: Introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, exceeding previous attempts in dataset size, annotation richness, scene variability, and complexity.
CityPersons: A Diverse Dataset for Pedestrian Detection
TLDR: Revisits CNN design and points out key adaptations that enable plain Faster R-CNN to obtain state-of-the-art results on the Caltech dataset, and introduces CityPersons, a new set of person annotations on top of the Cityscapes dataset, to achieve further improvement from more and better data.
Vision meets robotics: The KITTI dataset
TLDR: Presents a novel dataset, captured from a VW station wagon for use in mobile robotics and autonomous driving research, using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras and a high-precision GPS/IMU inertial navigation system.
LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop
TLDR: Proposes to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop, and constructs LSUN, a new image dataset containing around one million labeled images for each of 10 scene categories and 20 object categories.
YouTube-8M: A Large-Scale Video Classification Benchmark
TLDR: Introduces YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video) annotated with a vocabulary of 4800 visual entities; various (modest) classification models are trained on the dataset.
Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene…
A new performance measure and evaluation benchmark for road detection algorithms
TLDR: Proposes a novel, behavior-based metric that judges the utility of the extracted ego-lane area for driver-assistance applications by fitting a driving corridor to the road detection results in the bird's-eye view (BEV).
End-to-End Learning of Driving Models from Large-Scale Video Datasets
TLDR: Advocates learning a generic vehicle motion model from large-scale crowd-sourced video data, and develops an end-to-end trainable architecture for learning to predict a distribution over future vehicle egomotion from instantaneous monocular camera observations and previous vehicle state.
Learning Deep Features for Scene Recognition using Places Database
TLDR: Introduces Places, a new scene-centric database with over 7 million labeled pictures of scenes, along with new methods to compare the density and diversity of image datasets, showing that Places is as dense as other scene datasets and has more diversity.