Tracking by Instance Detection: A Meta-Learning Approach

Authors: Guangting Wang, Chong Luo, Xiaoyan Sun, Zhiwei Xiong, Wenjun Zeng
Published in: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We consider the tracking problem as a special type of object detection problem, which we call instance detection. With proper initialization, a detector can be quickly converted into a tracker by learning the new instance from a single image. We find that model-agnostic meta-learning (MAML) offers a strategy to initialize the detector that satisfies our needs. We propose a principled three-step approach to build a high-performance tracker. First, pick any modern object detector trained with… 
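To make the role of MAML in the abstract above concrete, here is a minimal sketch of the inner (adaptation) and outer (meta) updates. It is a toy stand-in, not the paper's detector or training setup: the "model" is a single scalar weight fitting y = w * x, and every name (`inner_step`, `maml_outer_grad`, `make_task`, `meta_train`) and hyperparameter (`alpha`, `beta`) is hypothetical. The point it illustrates is that the meta-gradient is taken through the adaptation step, so the learned initialization is one from which a single gradient step on a new task works well.

```python
import numpy as np


def loss_and_grad(w, x, y):
    """Mean squared error of the scalar model y_hat = w * x, and its gradient in w."""
    err = w * x - y
    return float(np.mean(err ** 2)), float(2.0 * np.mean(err * x))


def inner_step(w, x_s, y_s, alpha):
    """One gradient step on the support set (the fast adaptation step)."""
    _, g = loss_and_grad(w, x_s, y_s)
    return w - alpha * g


def maml_outer_grad(w, task, alpha):
    """Query loss after adaptation and its exact gradient w.r.t. the initial w."""
    x_s, y_s, x_q, y_q = task
    w_adapted = inner_step(w, x_s, y_s, alpha)
    q_loss, g_q = loss_and_grad(w_adapted, x_q, y_q)
    # Chain rule through the inner step: d(w_adapted)/dw = 1 - alpha * d2L_support/dw2.
    d_adapted_d_w = 1.0 - alpha * 2.0 * float(np.mean(x_s ** 2))
    return q_loss, g_q * d_adapted_d_w


def make_task(rng, slope, n=8):
    """A toy task: support and query samples drawn from y = slope * x."""
    x_s = rng.uniform(-1.0, 1.0, n)
    x_q = rng.uniform(-1.0, 1.0, n)
    return x_s, slope * x_s, x_q, slope * x_q


def meta_train(slopes, steps=200, alpha=0.1, beta=0.05, seed=0):
    """Meta-learn an initial weight by averaging outer gradients over tasks."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(steps):
        grads = [maml_outer_grad(w, make_task(rng, s), alpha)[1] for s in slopes]
        w -= beta * float(np.mean(grads))
    return w
```

In a tracking setting the support set would correspond to the first annotated frame and the query set to later frames, but that mapping is only an analogy here; the sketch says nothing about the detector or training recipe the paper actually uses.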

Fast and Robust Visual Tracking with Few-Iteration Meta-Learning

This work proposes a meta-learning method based on fast optimization for visual object tracking. The method performs well on the VOT2018 and GOT-10k datasets and is both fast and robust in real-time operation.

Real-Time Visual Object Tracking via Few-Shot Learning

This work proposes a generalized two-stage framework that can employ a large variety of FSL algorithms while offering faster adaptation. It also systematically investigates several forms of optimization-based few-shot learners from previous works, which differ in objective function, optimization method, or solution space.

Saliency-Associated Object Tracking

This paper proposes a fine-grained saliency mining module to capture the local saliencies of the target that are discriminative for tracking and designs a saliency-association modeling module to associate the captured saliencies together to learn effective correlation representations between the exemplar and the search image for state estimation.

Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters

This work formulates the tracking task as a two-stage procedure, exploiting the generalization ability of deep features to coarsely estimate target translation while ensuring invariance to appearance change, and designs an update control mechanism to learn appearance change while avoiding model drift.

Visual Tracking by Adaptive Continual Meta-Learning

This work formulates the visual tracking problem as a semi-supervised continual learning problem, where only an initial frame is labeled, and dynamically generates the hyperparameters needed for fast initialization and online update, achieving greater robustness by adaptively regulating the learning process.

Unified Transformer Tracker for Object Tracking

This work presents the Unified Transformer Tracker (UTT), in which a track transformer tracks the target in both SOT and MOT by exploiting the correlation between the target feature and the tracking frame feature to localize the target.

TSTrack: Tracking by Detector with Target Guidance and Self-Attention

A novel object tracking framework based on object detection with a self-attention mechanism is proposed, which dynamically self-learns along the spatial and channel dimensions to obtain a more robust object representation.

Explicitly Modeling the Discriminability for Instance-Aware Visual Object Tracking

A novel Instance-Aware Tracker is proposed to explicitly excavate the discriminability of feature representations; it improves the classical visual tracking pipeline with an instance-level classifier and introduces a contrastive learning mechanism to formulate the classification task.

Learning Target Candidate Association to Keep Track of What Not to Track

This work proposes a training strategy that combines partial annotations with self-supervision to keep track of distractor objects in order to continue tracking the target, and introduces a learned association network to propagate the identities of all target candidates from frame to frame.

Do We Really Need Frame-by-Frame Annotation Datasets for Object Tracking?

The results imply that complete video annotation might not be necessary for object tracking when motion-driven data augmentations are leveraged during training. The work presents AMMC (Augmentation by Mimicking Motion Change), a data augmentation strategy that enables learning high-performing trackers using small-scale datasets.



Bridging the Gap Between Detection and Tracking: A Unified Approach

This paper aims to explore a general framework for building trackers directly upon almost any advanced object detector, and introduces an anchored updating strategy to alleviate the problem of overfitting.

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers

This paper improves state-of-the-art visual object trackers that use online adaptation by using an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking.

TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild

This work presents TrackingNet, the first large-scale dataset and benchmark for object tracking in the wild. The dataset covers a wide selection of object classes in broad and diverse contexts, and the work provides an extensive benchmark on TrackingNet by evaluating more than 20 trackers.

Fully-Convolutional Siamese Networks for Object Tracking

A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.
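The matching step behind a fully-convolutional Siamese tracker can be sketched as a sliding-window cross-correlation between an exemplar feature map and a larger search feature map. The sketch below is illustrative only: the function names `cross_correlate` and `locate` are hypothetical, and plain arrays stand in for the embeddings that a learned convolutional network would produce in the actual method.

```python
import numpy as np


def cross_correlate(search, exemplar):
    """Dense similarity map between a search feature map of shape (C, H, W)
    and an exemplar feature map of shape (C, h, w).

    Returns a score map of shape (H - h + 1, W - w + 1)."""
    C, H, W = search.shape
    c, h, w = exemplar.shape
    assert C == c, "channel counts must match"
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            # Inner product between the exemplar and the window at (i, j).
            scores[i, j] = np.sum(search[:, i:i + h, j:j + w] * exemplar)
    return scores


def locate(search, exemplar):
    """Top-left (row, col) of the window with the highest similarity score."""
    scores = cross_correlate(search, exemplar)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return int(row), int(col)
```

The explicit loops are kept for readability; in practice the same operation is usually expressed as a single convolution call in a deep learning framework.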

Deep Meta Learning for Real-Time Target-Aware Visual Tracking

A novel on-line visual tracking framework based on the Siamese matching network and meta-learner network that performs at a real-time speed while maintaining competitive performance among other state-of-the-art tracking algorithms.

ATOM: Accurate Tracking by Overlap Maximization

This work proposes a novel tracking architecture, consisting of dedicated target estimation and classification components, and introduces a classification component that is trained online to guarantee high discriminative power in the presence of distractors.

High Performance Visual Tracking with Siamese Region Proposal Network

The Siamese region proposal network (Siamese-RPN) is proposed, which is trained end-to-end off-line with large-scale image pairs for visual object tracking and consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork that includes a classification branch and a regression branch.

Learning Discriminative Model Prediction for Tracking

An end-to-end tracking architecture, capable of fully exploiting both target and background appearance information for target model prediction, derived from a discriminative learning loss by designing a dedicated optimization process that is capable of predicting a powerful model in only a few iterations.

CREST: Convolutional Residual Learning for Visual Tracking

This paper proposes the CREST algorithm to reformulate DCFs as a one-layer convolutional neural network, and applies residual learning to take appearance changes into account to reduce model degradation during online update.

FCOS: Fully Convolutional One-Stage Object Detection

For the first time, a much simpler and flexible detection framework achieving improved detection accuracy is demonstrated, and it is hoped that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.