On Detection, Data Association and Segmentation for Multi-Target Tracking

Yicong Tian, Afshin Dehghan, Mubarak Shah
IEEE Transactions on Pattern Analysis and Machine Intelligence

In this work, we propose a tracker that differs from most existing multi-target trackers in two major ways. […] The proposed algorithm consists of two main components: structured learning and Lagrange dual decomposition. Our structured-learning-based tracker learns a model for each target and infers the best locations of all targets simultaneously in a video clip. The inference of our structured learning is achieved through a new Target Identity-aware Network Flow (TINF), where each node in the…

Deep Affinity Network for Multiple Object Tracking

The proposed Deep Affinity Network (DAN) learns compact, yet comprehensive features of pre-detected objects at several levels of abstraction, and performs exhaustive pairing permutations of those features in any two frames to infer object affinities.
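
The "exhaustive pairing" idea in the summary above can be illustrated with a minimal sketch. DAN learns its affinities with a deep network; the cosine similarity used here is only a hand-written stand-in for that learned score, and the function names and toy feature vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def affinity_matrix(feats_a, feats_b):
    """Score every object in one frame against every object in another.

    Entry [i][j] is the affinity between object i of the first frame and
    object j of the second. Cosine similarity is an assumption made for
    this sketch, not the learned affinity from the paper.
    """
    return [[cosine(fa, fb) for fb in feats_b] for fa in feats_a]


# Toy appearance features: two objects in frame t, three in frame t+1.
frame_t = [[1.0, 0.0], [0.0, 1.0]]
frame_t1 = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]

A = affinity_matrix(frame_t, frame_t1)
for row in A:
    print([round(x, 3) for x in row])
```

Each row of `A` can then be read as the evidence for linking one object in the earlier frame to each candidate in the later frame.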

Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking

This work introduces the Deep Motion Modeling Network (DMM-Net), which estimates multiple objects' motion parameters to perform joint detection and association in an end-to-end manner. It also demonstrates the suitability of Omni-MOT for deep learning with DMM-Net and makes the network's source code public.

Multiple Object Tracking in Deep Learning Approaches: A Survey

This paper focuses on giving a thorough review of the evolution of MOT in recent decades, investigating the recent advances in MOT, and showing some potential directions for future work.

Transformer-based assignment decision network for multiple object tracking

This work introduces the Transformer-based Assignment Decision Network (TADN), which tackles data association without any explicit optimization during inference and directly infers assignment pairs between detections and active targets in a single forward pass of the network.

Spatial–Semantic and Temporal Attention Mechanism-Based Online Multi-Object Tracking

A more efficient and effective spatial–temporal attention scheme for tracking multiple objects in various scenarios, using a semantic-feature-based spatial attention mechanism and a novel motion model to address the insertion and location of candidates.

Multi-object tracking based on network flow model and ORB feature

This work extracts ORB features from the detection responses and matches them to achieve data association at the intra-tracklet stage, and demonstrates that, compared with other state-of-the-art algorithms, the method improves tracking performance in complex environments.

Multiple Object Tracking in Recent Times: A Literature Review

This review surveys the different techniques that researchers have used over time to solve the problems of MOT, and suggests some future directions for researchers.

Multi-Scale Person Localization With Multi-Stage Deep Sequential Framework

A head detection framework that estimates the scales of people’s heads, on the argument that the head is the only visible part in complex scenes; it achieves state-of-the-art performance.

Looking Beyond Two Frames: End-to-End Multi-Object Tracking Using Spatial and Temporal Transformers

MO3TR is presented: a truly end-to-end Transformer-based online multi-object tracking (MOT) framework that learns to handle occlusions, track initiation and termination without the need for an explicit data association module or any heuristics.

Reinforcement Learning-Based Data Association for Multiple Target Tracking in Clutter

A novel data association method based on reinforcement learning (RL), called RL-JPDA, is proposed for multiple target tracking; the proposed method is shown to yield a shorter execution time than other methods.



Joint tracking and segmentation of multiple targets

This work proposes a multi-target tracker that exploits low level image information and associates every (super)-pixel to a specific target or classifies it as background and obtains a video segmentation in addition to the classical bounding-box representation in unconstrained, real-world videos.

Target Identity-aware Network Flow for online multiple target tracking

It is shown that automatically detecting and tracking targets in a single framework can help resolve the ambiguities due to frequent occlusion and heavy articulation of targets.

Coupling detection and data association for multiple object tracking

A novel framework for multiple object tracking in which the problems of object detection and data association are expressed by a single objective function, which follows the Lagrange dual decomposition strategy.
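
As a rough illustration of the Lagrange dual decomposition strategy mentioned above, the toy sketch below minimises the sum of two cost tables over a single shared discrete choice. It is not the paper's detection/association formulation: the cost tables, the `1/t` step size, and the function name are all assumptions made for the example.

```python
def dual_decomposition(f1, f2, max_iters=100):
    """Minimise f1[k] + f2[k] over k by solving two subproblems separately.

    Each subproblem keeps its own copy of the decision. Lagrange
    multipliers `lam` penalise disagreement between the copies and are
    updated by subgradient ascent on the dual. Returns (index, iterations),
    or (None, max_iters) if the copies never agree.
    """
    n = len(f1)
    lam = [0.0] * n
    for t in range(1, max_iters + 1):
        # Solve each subproblem independently under the current multipliers.
        x1 = min(range(n), key=lambda i: f1[i] + lam[i])
        x2 = min(range(n), key=lambda i: f2[i] - lam[i])
        if x1 == x2:
            # Agreement certifies that the shared choice is globally optimal.
            return x1, t
        # Subgradient step with a diminishing step size (an assumption).
        step = 1.0 / t
        lam[x1] += step
        lam[x2] -= step
    return None, max_iters


# Hypothetical cost tables for the two subproblems.
cost_a = [3.0, 1.0, 4.0]
cost_b = [1.0, 5.0, 0.5]

best, iterations = dual_decomposition(cost_a, cost_b)
brute_force = min(range(len(cost_a)), key=lambda i: cost_a[i] + cost_b[i])
print(best, brute_force)
```

The point of the decomposition is that each subproblem is solved on its own and only the multipliers couple them, which is what makes the strategy attractive when the two subproblems have very different structure.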


Tracking-Learning-Detection

Zdenek Kalal, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012

A novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection, and develops a novel learning method (P-N learning) which estimates the errors by a pair of “experts”: the P-expert estimates missed detections, and the N-expert estimates false alarms.

Subgraph decomposition for multi-target tracking

This work states multi-target tracking as a Minimum Cost Subgraph Multicut Problem and proposes to link and cluster plausible detections jointly across space and time to facilitate long-range re-identification and within-frame clustering.

Struck: Structured Output Tracking with Kernels

A framework for adaptive visual object tracking based on structured output prediction that is able to outperform state-of-the-art trackers on various benchmark videos and can easily incorporate additional features and kernels into the framework, which results in increased tracking performance.

Superpixel tracking

This paper presents a discriminative appearance model based on superpixels, which helps a tracker distinguish the target from the background using mid-level cues, and is shown to perform favorably against existing methods for object tracking.

Multi-target tracking by on-line learned discriminative appearance models

OLDAMs have significantly higher discrimination between different targets than conventional holistic color histograms, and when integrated into a hierarchical association framework, they help improve the tracking accuracy, particularly reducing the false alarms and identity switches.

Multi-person Tracking with Sparse Detection and Continuous Segmentation

This paper presents an integrated framework for mobile street-level tracking of multiple persons that employs an efficient level-set tracker in order to follow individual pedestrians over time and requires the pedestrian detector to be active only part of the time, resulting in computational savings.

Robust Online Multi-object Tracking Based on Tracklet Confidence and Online Discriminative Appearance Learning

This paper proposes a robust online multi-object tracking method that can handle frequent occlusion by clutter or other objects, together with a novel online learning method that uses incremental linear discriminant analysis to discriminate between the appearances of objects.