FEAR: Fast, Efficient, Accurate and Robust Visual Tracker

Vasyl Borsuk, Roman Vei, Orest Kupyn, T. Martyniuk, Igor Krashenyi and Jiri Matas

We present FEAR, a family of fast, efficient, accurate, and robust Siamese visual trackers. We present a novel and efficient way to benefit from dual-template representation for object model adaptation, which incorporates temporal information with only a single learnable parameter. We further improve the tracker architecture with a pixel-wise fusion block. By plugging in sophisticated backbones with the above-mentioned modules, the FEAR-M and FEAR-L trackers surpass most Siamese…
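The dual-template idea above can be illustrated with a minimal sketch (not the authors' code; function names and feature shapes are hypothetical): the object model is a convex blend of the static initial template and a dynamically updated one, gated by a single learnable scalar.

```python
import numpy as np

def blend_templates(static_feat, dynamic_feat, w):
    """Dual-template object model: convex combination of the initial
    (static) template features and an online-updated (dynamic) template,
    controlled by a single learnable scalar w (hypothetical name)."""
    alpha = 1.0 / (1.0 + np.exp(-w))       # squash the scalar to (0, 1)
    return alpha * static_feat + (1.0 - alpha) * dynamic_feat

static = np.ones((256, 8, 8))              # C x H x W template features
dynamic = np.zeros((256, 8, 8))            # online-updated template
fused = blend_templates(static, dynamic, w=0.0)  # sigmoid(0) = 0.5
```

With `w = 0` the blend weights both templates equally; during training the single scalar can shift the balance toward the static or dynamic template.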

Robust visual tracking using very deep generative model

This study investigated how increasing the number of fully connected (FC) layers in generative adversarial networks affects robustness, and used a very deep FC network with 22 layers as a high-performance generator for the first time.

On designing light-weight object trackers through network pruning: Use CNNs or transformers?

This paper demonstrates how highly compressed light-weight object trackers can be designed via neural architecture pruning of large CNN- and transformer-based trackers, and provides deeper insights into designing highly efficient trackers from existing SOTA methods.

SRRT: Search Region Regulation Tracking

A novel tracking paradigm is proposed, called Search Region Regulation Tracking (SRRT), which applies a proposed search region regulator to estimate an optimal search region dynamically for every frame, adapting to the object's appearance variation during tracking.

Siamese Instance Search for Tracking

It turns out that the learned matching function is so powerful that a simple tracker built upon it, coined Siamese INstance search Tracker, SINT, suffices to reach state-of-the-art performance.

Distractor-aware Siamese Networks for Visual Object Tracking

This paper focuses on learning distractor-aware Siamese networks for accurate and long-term tracking, and extends the proposed approach to long-term tracking by introducing a simple yet effective local-to-global search region strategy.

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

This work shows that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture that performs depth-wise and layer-wise aggregations, which not only improves accuracy but also reduces model size.

ATOM: Accurate Tracking by Overlap Maximization

This work proposes a novel tracking architecture consisting of dedicated target estimation and classification components; the classification component is trained online to guarantee high discriminative power in the presence of distractors.

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

This paper proposes new residual modules to eliminate the negative impact of padding, and designs new architectures using these modules with controlled receptive field size and network stride that guarantee real-time tracking speed when applied to SiamFC and SiamRPN.

RPT: Learning Point Set Representation for Siamese Visual Tracking

This paper argues that this issue is closely related to the prevalent bounding-box representation, which provides only a coarse spatial extent of the object, and proposes an efficient visual tracking framework that accurately estimates the target state with a finer representation as a set of representative points.

Deep Learning for Visual Tracking: A Comprehensive Survey

This survey systematically investigates current DL-based visual tracking methods, benchmark datasets, and evaluation metrics, and extensively evaluates and analyzes the leading visual tracking methods.

High Performance Visual Tracking with Siamese Region Proposal Network

The Siamese region proposal network (Siamese-RPN) is proposed, which is trained end-to-end offline with large-scale image pairs for visual object tracking; it consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork comprising a classification branch and a regression branch.
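The Siamese matching underlying such trackers can be sketched as a cross-correlation of template features over search-region features (an illustrative NumPy sketch with hypothetical shapes, not the paper's implementation; real trackers implement this as a batched depth-wise convolution on the GPU):

```python
import numpy as np

def xcorr(search, kernel):
    """Naive depth-summed cross-correlation: slide the template feature
    map (kernel) over the search feature map, producing a similarity
    response map. search: (C, Hs, Ws), kernel: (C, Hk, Wk)
    -> (Hs - Hk + 1, Ws - Wk + 1)."""
    _, Hs, Ws = search.shape
    _, Hk, Wk = kernel.shape
    out = np.zeros((Hs - Hk + 1, Ws - Wk + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search[:, i:i + Hk, j:j + Wk] * kernel)
    return out

rng = np.random.default_rng(0)
search = rng.random((16, 24, 24))     # search-region features
template = rng.random((16, 8, 8))     # exemplar (template) features
score_map = xcorr(search, template)   # 17 x 17 response map
```

In an RPN-style head, separate classification and regression branches would each apply such a correlation with their own template embeddings.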

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

This work presents LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers that achieve superior performance compared to handcrafted SOTA trackers, such as SiamRPN++ and Ocean, while using far fewer FLOPs and parameters.

Learning Spatio-Temporal Transformer for Visual Tracking

A new tracking architecture is presented with an encoder-decoder transformer as the key component: the encoder models the global spatio-temporal feature dependencies between target objects and search regions, while the decoder learns a query embedding to predict the spatial positions of the target objects.
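The query-based decoding step can be illustrated with a minimal scaled dot-product attention sketch (hypothetical names and shapes, not the paper's code): a learned target query attends over the flattened spatio-temporal encoder memory to pool target-specific features.

```python
import numpy as np

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.
    query: (d,), keys/values: (N, d) -> pooled feature (d,)."""
    d = query.shape[0]
    scores = keys @ query / np.sqrt(d)      # (N,) similarity scores
    weights = np.exp(scores - scores.max()) # numerically stable softmax
    weights /= weights.sum()
    return weights @ values                 # convex combination of values

rng = np.random.default_rng(0)
memory = rng.random((100, 64))   # flattened spatio-temporal features
q = rng.random(64)               # learned target query (hypothetical)
target_feat = attend(q, memory, memory)
```

Since the attention weights form a convex combination, the pooled feature stays within the range of the encoder memory; a small prediction head would then map it (or the attended spatial map) to the target's box coordinates.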