SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking

@article{Guo2020SiamCARSF,
  title={SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking},
  author={Dongyan Guo and Jun Wang and Ying Cui and Zhenhua Wang and Shengyong Chen},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={6268--6276}
}
By decomposing the visual tracking task into two subproblems as classification for pixel category and regression for object bounding box at this pixel, we propose a novel fully convolutional Siamese network to solve visual tracking end-to-end in a per-pixel manner. The proposed framework SiamCAR consists of two simple subnetworks: one Siamese subnetwork for feature extraction and one classification-regression subnetwork for bounding box prediction. Different from state-of-the-art trackers like… 
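The per-pixel idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the classification branch yields a foreground score per location and that the regression branch yields four distances (left, top, right, bottom) from that location to the box sides; the truncated abstract above does not spell out this parameterization. The function name `decode_best_box` and the `stride` and `offset` parameters are hypothetical and only serve to map a score-map cell back to search-image coordinates.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Dist = Tuple[float, float, float, float]  # (left, top, right, bottom)


def decode_best_box(
    cls_map: List[List[float]],
    reg_map: List[List[Dist]],
    stride: int = 8,   # hypothetical: image pixels per score-map cell
    offset: int = 0,   # hypothetical: image coordinate of cell (0, 0)
) -> Tuple[Box, float]:
    """Pick the highest-scoring location and convert its regression to a box.

    cls_map[i][j] is the foreground score at row i, column j.
    reg_map[i][j] holds the assumed distances from that location to the
    left, top, right and bottom sides of the box, in image pixels.
    """
    best_score = float("-inf")
    best_i, best_j = 0, 0
    for i, row in enumerate(cls_map):
        for j, score in enumerate(row):
            if score > best_score:
                best_score, best_i, best_j = score, i, j

    # Map the winning cell back to search-image coordinates.
    x = offset + best_j * stride
    y = offset + best_i * stride

    left, top, right, bottom = reg_map[best_i][best_j]
    return (x - left, y - top, x + right, y + bottom), best_score


# Toy 2x2 maps: the bottom-right cell has the highest score.
cls_map = [[0.1, 0.2], [0.3, 0.9]]
reg_map = [
    [(1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)],
    [(1.0, 1.0, 1.0, 1.0), (4.0, 3.0, 6.0, 5.0)],
]
box, score = decode_best_box(cls_map, reg_map, stride=8)
print(box, score)  # (4.0, 5.0, 14.0, 13.0) 0.9
```

A real tracker would typically add score post-processing before choosing the location; this sketch keeps only the classification-then-regression decoding step.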

SiamCAN: Real-Time Visual Tracking Based on Siamese Center-Aware Network

TLDR
A novel Siamese center-aware network (SiamCAN) for visual tracking, which consists of the Siamese feature extraction subnetwork, followed by the classification, regression, and localization branches in parallel, which achieves leading accuracy with high efficiency.

SiamPCF: siamese point regression with coarse-fine classification network for visual tracking

TLDR
This work proposes the SiamPCF, which is an anchor-free tracking method that avoids the carefully selected hyperparameters needed to design anchors, and uses points adaptively positioned on the target to describe the target and transform these points to bounding boxes.

Improved Siamese classification and regression adaptive network for visual tracking

TLDR
This work proposes a novel visual tracking framework named ISiamCRAN, which uses a modified ResNet-50 as the backbone network and replaces the traditional sample label assignment strategy with an elliptical one, so that it can more accurately distinguish the foreground from the background.

Visual Object Tracking with Discriminative Filters and Siamese Networks: A Survey and Outlook

TLDR
This survey presents a systematic and thorough review of more than 90 DCFs and Siamese trackers, based on results in nine tracking benchmarks, and distinguishes and comprehensively reviews the shared as well as specific open research challenges in both these tracking paradigms.

RPT: Learning Point Set Representation for Siamese Visual Tracking

TLDR
This paper argues that this issue is closely related to the prevalent bounding box representation, which provides only a coarse spatial extent of the object, and proposes an efficient visual tracking framework to accurately estimate the target state with a finer representation as a set of representative points.

Pyramid Correlation based Deep Hough Voting for Visual Object Tracking

TLDR
A novel voting-based classification-only tracking algorithm named Pyramid Correlation based Deep Hough Voting (PCDHV for short) is introduced, to jointly locate the top-left and bottom-right corners of the target.

Template Enhancement and Mask Generation for Siamese Tracking

TLDR
This work proposes constructing an alternative template explicitly to address the underfitting of the instance space, and obtains the descriptor aggregation to transform the semantic segmentation outputs for mask prediction in Siamese tracking.

Robust Template Adjustment Siamese Network for Object Visual Tracking

TLDR
A novel Template Adjustment Siamese Network (TA-Siam) is proposed, which makes the template adapt to the target appearance variation of long-term sequences and effectively overcomes the model drift problem of Siamese networks.

SiamMask: A Framework for Fast Online Object Tracking and Segmentation

TLDR
This paper improves the offline training procedure of popular fully-convolutional Siamese approaches by augmenting their losses with a binary segmentation task, and introduces SiamMask, a framework to perform both visual object tracking and video object segmentation, in real-time, with the same simple method.

SiamCPN: Visual tracking with the Siamese center-prediction network

TLDR
This study proposes a new anchor-free network, the Siamese center-prediction network (SiamCPN), which directly predicts the center point and size of the object in subsequent frames in a Siamese-structure network without the need for per-frame post-processing operations.
...

References

SHOWING 1-10 OF 42 REFERENCES

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

TLDR
This work proves that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture to perform depth-wise and layer-wise aggregations, which not only improves the accuracy but also reduces the model size.

Triplet Loss in Siamese Network for Object Tracking

TLDR
A novel triplet loss is proposed to extract expressive deep features for object tracking by adding it into the Siamese network framework instead of the pairwise loss for training.

Learning Dynamic Siamese Network for Visual Object Tracking

TLDR
This paper proposes a dynamic Siamese network, via a fast transformation learning model that enables effective online learning of target appearance variation and background suppression from previous frames, and presents elementwise multi-layer fusion to adaptively integrate the network outputs using multi-level deep features.

Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking

  • Heng Fan, Haibin Ling
  • Computer Science
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
A multi-stage tracking framework, Siamese Cascaded RPN (C-RPN), which consists of a sequence of RPNs cascaded from deep high-level to shallow low-level layers in a Siamese network, which achieves state-of-the-art results and runs in real-time.

A Twofold Siamese Network for Real-Time Object Tracking

TLDR
The proposed SA-Siam outperforms all other real-time trackers by a large margin on OTB-2013/50/100 benchmarks and proposes a channel attention mechanism for the semantic branch.

Distractor-aware Siamese Networks for Visual Object Tracking

TLDR
This paper focuses on learning distractor-aware Siamese networks for accurate and long-term tracking, and extends the proposed approach for long-term tracking by introducing a simple yet effective local-to-global search region strategy.

Fully-Convolutional Siamese Networks for Object Tracking

TLDR
A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.

Graph Convolutional Tracking

TLDR
The GCT jointly incorporates two types of Graph Convolutional Networks into a Siamese framework for target appearance modeling and adopts a spatial-temporal GCN to model the structured representation of historical target exemplars.

CREST: Convolutional Residual Learning for Visual Tracking

TLDR
This paper proposes the CREST algorithm to reformulate DCFs as a one-layer convolutional neural network, and applies residual learning to take appearance changes into account to reduce model degradation during online update.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.