EagerMOT: 3D Multi-Object Tracking via Sensor Fusion

Aleksandr Kim, Aljosa Osep, Laura Leal-Taixé
2021 IEEE International Conference on Robotics and Automation (ICRA)

Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time. Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal. On the other hand, cameras provide a dense and rich visual signal that helps to localize even distant objects, but only in the image domain. In this paper, we propose… 

DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion With Deep Association

Extensive experiments indicate that the proposed robust and fast camera-LiDAR fusion-based MOT method presents obvious advantages over the state-of-the-art MOT methods in terms of both tracking accuracy and processing speed.

CLAMOT: 3D Detection and Tracking via Multi-modal Feature Aggregation

A camera and LiDAR aggregation module named CLA-fusion is proposed to fuse the two modal features in a point-wise manner. The method is center-based: a keypoint detector finds object centers, and the remaining attributes, such as 3D size and velocity, are regressed.

InterTrack: Interaction Transformer for 3D Multi-Object Tracking

This work introduces the Interaction Transformer for 3D MOT to generate discriminative object representations for data association. It extracts state and shape features for each track and detection and aggregates global information via attention.

Monocular 3D Multi-Object Tracking with an EKF Approach for Long-Term Stable Tracks

This paper presents a multi-object tracking approach built on an Extended Kalman filter that estimates the 3D state and uses detections for track initialization. It shows that this 3D representation is valuable: the approach achieves state-of-the-art results on the KITTI dataset with an association based solely on 2D bounding box comparison.
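
To make "association based solely on 2D bounding box comparison" concrete, here is a minimal sketch of one common way to do it: greedy matching on intersection over union (IoU). It is not the cited paper's implementation; the function names, the `(x1, y1, x2, y2)` box format, and the `min_iou` threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_associate(tracks, detections, min_iou=0.3):
    """Match track boxes to detection boxes, best IoU first.

    Returns a list of (track_index, detection_index) pairs. Each track and
    each detection is used at most once; pairs below min_iou are dropped.
    """
    scored = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in scored:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (100, 100, 110, 110)]
matches = greedy_associate(tracks, detections)
```

Unmatched detections (here the third one) would typically start new tracks, and unmatched tracks would be carried forward by the filter's prediction. An optimal assignment (for example the Hungarian algorithm) could replace the greedy loop.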

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR Fusion

This work proposes a novel camera-LiDAR fusion 3D MOT framework based on the Combined Appearance-Motion Optimization (CAMO-MOT), which uses both camera and LiDAR data and reduces tracking failures caused by occlusion and false detection.

3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation

3D-FCT is a Siamese network architecture that uses temporal information to perform 3D object detection and tracking simultaneously. It produces high-accuracy detections by linking short-term object tracklets into long-term tracks based on the predicted tracks.

MV-3DT: Multi-View 3D Object Tracking via Cross-Camera 3D Fusion

A cross-camera 3D fusion method that fuses 3D detection results from 7 multiple cameras before association and significantly reduces identity switches. It sets a new state of the art among all camera-based methods on the NuScenes 3D tracking benchmark, outperforming previously published methods by 8.7 percentage points.

Interactive Multi-scale Fusion of 2D and 3D Features for Multi-object Tracking

Through multi-scale interactive query and fusion between pixel-level and point-level features, the method obtains more distinguishing features to improve multi-object tracking performance. The work also explores the effectiveness of pre-training on each single modality and fine-tuning on the fusion-based model.

DirectTracker: 3D Multi-Object Tracking Using Direct Image Alignment and Photometric Bundle Adjustment

This work proposes DirectTracker, a framework that effectively combines direct image alignment for short-term tracking and sliding-window photometric bundle adjustment for 3D object detection, and evaluates 3D tracking using the recently introduced higher-order tracking accuracy (HOTA) metric and the generalized intersection over union similarity measure.
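
As a reminder of what the generalized intersection over union (GIoU) measure computes, the sketch below evaluates it for 2D axis-aligned boxes. This is a simplification for illustration only: the cited work evaluates 3D tracking, and the `giou` name and the `(x1, y1, x2, y2)` box format are assumptions, not taken from that paper.

```python
def giou(a, b):
    """Generalized IoU of two axis-aligned boxes (x1, y1, x2, y2).

    GIoU = IoU - (hull_area - union_area) / hull_area, where the hull is
    the smallest axis-aligned box enclosing both inputs. The value lies in
    (-1, 1]; unlike IoU it stays informative for non-overlapping boxes.
    """
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # Intersection rectangle (zero if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter

    # Smallest enclosing box.
    hull = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))

    if union <= 0 or hull <= 0:
        return 0.0  # degenerate boxes; this fallback is a choice of the sketch
    return inter / union - (hull - union) / hull


same = giou((0, 0, 2, 2), (0, 0, 2, 2))      # identical boxes
apart = giou((0, 0, 1, 1), (2, 0, 3, 1))     # disjoint boxes, one unit apart
partial = giou((0, 0, 2, 2), (1, 1, 3, 3))   # partial overlap
```

For the disjoint pair, plain IoU would be 0 regardless of distance, while GIoU becomes more negative as the boxes move apart, which is why it is useful as a similarity measure when boxes barely overlap.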

Semantic Geometric Fusion Multi-object Tracking and Lidar Odometry in Dynamic Environment

A least-squares estimator incorporating semantic 3D bounding boxes and geometric point clouds is used to achieve accurate and robust tracking of multiple objects. The effectiveness of the proposed semantic geometric fusion multi-object tracking (SGF-MOT) module and the localization accuracy of the MLO system are evaluated on the public KITTI dataset.



JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset

JRMOT is a novel 3D MOT system that integrates information from RGB images and 3D point clouds to achieve real-time, state-of-the-art tracking performance and serves as the first 3D tracking solution for the authors' benchmark.

Argoverse: 3D Tracking and Forecasting With Rich Maps

Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami, as well as 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps. The maps contain geometric and semantic metadata that are not currently available in any public dataset.

Joint self-localization and tracking of generic objects in 3D range data

A new algorithm is proposed that treats both the estimation of the trajectory of a sensor and the detection and tracking of moving objects jointly and has applicability to any type of environment since specific object models are not used at any algorithm stage.

Combined image- and world-space tracking in traffic scenes

This work presents its tracking pipeline as a 3D extension of image-based tracking, which uses world-space 3D information at every stage of processing by combining a novel coupled 2D-3D Kalman filter with a conceptually clean and extendable hypothesize-and-select framework.

Beyond Pixels: Leveraging Geometry and Shape Cues for Online Multi-Object Tracking

This paper introduces geometry and object shape and pose costs for multi-object tracking in urban driving scenarios. Using images from a monocular camera alone, we devise pairwise costs for object

nuScenes: A Multimodal Dataset for Autonomous Driving

Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object

Joint Monocular 3D Vehicle Detection and Tracking

A novel online framework for 3D vehicle detection and tracking from monocular videos that can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.

3D Multi-Object Tracking: A Baseline and New Evaluation Metrics

Surprisingly, although the proposed system does not use any 2D data as inputs, it achieves competitive performance on the KITTI 2D MOT leaderboard and runs at a rate of 207.4 FPS, achieving the fastest speed among all modern MOT systems.

Robust Multi-Modality Multi-Object Tracking

This study designs a generic sensor-agnostic multi-modality MOT framework (mmMOT), in which each modality can perform its role independently to preserve reliability and can further improve its accuracy through a novel multi-modality fusion module.