• Corpus ID: 220250329

1st Place Solutions for Waymo Open Dataset Challenges - 2D and 3D Tracking

Yu Wang, Sijia Chen, Li Huang, Runzhou Ge, Yihan Hu, Zhuangzhuang Ding, Jie Liao
This technical report presents the online, real-time 2D and 3D multi-object tracking (MOT) algorithms that took first place in both the Waymo Open Dataset 2D tracking and 3D tracking challenges. An efficient and pragmatic online tracking-by-detection framework named HorizonMOT is proposed for camera-based 2D tracking in image space and LiDAR-based 3D tracking in 3D world space. Within the tracking-by-detection paradigm, our trackers leverage our high-performing detectors used in…

Cross-Modal 3D Object Detection and Tracking for Auto-Driving

This paper proposes a cross-modal fusion scheme that fuses camera appearance features with LiDAR features to facilitate 3D detection and tracking. It also attaches an additional branch to the 3D detector to output instance-aware appearance embeddings, which significantly improves tracking performance with the designed association mechanisms.

Monocular Quasi-Dense 3D Object Tracking

This work proposes a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform. An LSTM-based object velocity learning module aggregates long-term trajectory information for more accurate motion extrapolation.

Frustum Votenet and its Application to the Waymo Lidar Dataset

This work benchmarked VoteNet on (a subset of) the Waymo lidar dataset and observed that Frustum VoteNet outperformed the status quo VoteNet by a factor of 6 in terms of mAP.

Local Metrics for Multi-Object Tracking

It is shown that the historical Average Tracking Accuracy (ATA) metric exhibits superior sensitivity to association, enabling its proposed local variant, ALTA, to capture a wide range of characteristics.

AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds

This work devises a single-stage anchor-free network, named AFDetV2, that achieves state-of-the-art results on these two datasets, outperforming all prior work, including both single-stage and two-stage 3D detectors.

Real-Time Anchor-Free Single-Stage 3D Detection with IoU-Awareness

This report introduces the winning solution to the Real-time 3D Detection challenge, which was also named the “Most Efficient Model”, in the Waymo Open Dataset Challenges at CVPR 2021. It makes a handful of modifications to the base model to improve accuracy while greatly reducing latency.



Probabilistic 3D Multi-Object Tracking for Autonomous Driving

This paper presents an online tracking method that took first place in the nuScenes Tracking Challenge and outperforms the AB3DMOT baseline method by a large margin in the Average Multi-Object Tracking Accuracy (AMOTA) metric.

Tracking Objects as Points

Tracking has traditionally been the art of following interest points through space and time. This changed with the rise of powerful deep networks. Nowadays, tracking is dominated by pipelines that

AFDet: Anchor Free One Stage 3D Object Detection

This work proposes an anchor-free and Non-Maximum Suppression-free one-stage detector called AFDet that, thanks to its simplified post-processing, can be processed efficiently on a CNN accelerator or a GPU.

High-Speed tracking-by-detection without using image information

This work presents a tracking-by-detection algorithm which can compete with more sophisticated approaches at a fraction of the computational cost and shows with thorough experiments its potential using a wide range of object detectors.
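
As a rough illustration of tracking-by-detection without image information, the sketch below associates existing tracks with new detections using bounding-box overlap alone. It is not the authors' implementation: the function names `iou` and `associate`, the `(x1, y1, x2, y2)` box layout, and the `min_iou` threshold of 0.5 are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box], detections: List[Box], min_iou: float = 0.5
) -> Tuple[Dict[int, int], List[int]]:
    """Greedy matching: each track takes its best-overlapping unused detection.

    Returns (track id -> detection index, indices of unmatched detections).
    Unmatched detections would typically start new tracks.
    """
    unused = list(range(len(detections)))
    matches: Dict[int, int] = {}
    for track_id, last_box in tracks.items():
        if not unused:
            break
        best = max(unused, key=lambda j: iou(last_box, detections[j]))
        if iou(last_box, detections[best]) >= min_iou:
            matches[track_id] = best
            unused.remove(best)
    return matches, unused
```

Greedy matching is cheap but order-dependent; trackers such as SORT instead solve a global assignment with the Hungarian algorithm.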

Simple online and realtime tracking with a deep association metric

This paper integrates appearance information to improve the performance of SORT and reduces the number of identity switches, achieving overall competitive performance at high frame rates.

RetinaTrack: Online Single Stage Joint Detection and Tracking

This paper proposes a conceptually simple and efficient joint model of detection and tracking, called RetinaTrack, which modifies the popular single stage RetinaNet approach such that it is amenable to instance-level embedding training.

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

This work introduces a new large scale, high quality, diverse dataset, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods.

Simple online and realtime tracking

Despite only using a rudimentary combination of familiar techniques such as the Kalman Filter and Hungarian algorithm for the tracking components, this approach achieves an accuracy comparable to state-of-the-art online trackers.

A Simple Baseline for Multi-Object Tracking

This work studies the essential reasons behind the failure of object detection and re-identification, and presents a simple baseline that remarkably outperforms the state of the art on public datasets at 30 fps.

Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics

This work introduces two intuitive and general metrics to allow for objective comparison of tracker characteristics, focusing on their precision in estimating object locations, their accuracy in recognizing object configurations and their ability to consistently label objects over time.
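
For reference, the two metrics are commonly written as follows. This is a sketch from the standard definitions, not text from the summary above. Here $m_t$, $fp_t$, and $mme_t$ are the misses, false positives, and identity mismatches at frame $t$, $g_t$ is the number of ground-truth objects at frame $t$, $d_t^i$ is the localization error of matched pair $i$ at frame $t$, and $c_t$ is the number of matches at frame $t$.

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left( m_t + fp_t + mme_t \right)}{\sum_t g_t},
\qquad
\mathrm{MOTP} = \frac{\sum_{i,t} d_t^i}{\sum_t c_t}
```

MOTA can be negative when the total number of errors exceeds the number of ground-truth objects.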