Multi-Modal Fusion for End-to-End RGB-T Tracking

@inproceedings{Zhang2019MultiModalFF,
  title={Multi-Modal Fusion for End-to-End RGB-T Tracking},
  author={Lichao Zhang and Martin Danelljan and Abel Gonzalez-Garcia and Joost van de Weijer and Fahad Shahbaz Khan},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)},
  year={2019},
  pages={2252--2261}
}
We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components of DiMP, i.e., the feature extractor, the target estimation network, and the classifier. We consider several fusion mechanisms acting at different levels of…
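The abstract describes fusing RGB and TIR modalities at several levels of the tracker, including the feature extractor. As a minimal illustrative sketch (not the authors' code), feature-level fusion is commonly implemented as channel concatenation of the two modalities' feature maps followed by a learned 1x1 convolution; the function name, shapes, and random weights below are assumptions for illustration only:

```python
import numpy as np

def fuse_features(feat_rgb, feat_tir, w):
    """Concatenate RGB and TIR feature maps along the channel axis,
    then project back with a 1x1 convolution (here a matrix multiply).

    feat_rgb, feat_tir: arrays of shape (C, H, W)
    w: projection matrix of shape (C_out, 2*C), acting as a 1x1 conv
    """
    stacked = np.concatenate([feat_rgb, feat_tir], axis=0)  # (2C, H, W)
    c2, h, wd = stacked.shape
    flat = stacked.reshape(c2, h * wd)                      # (2C, H*W)
    fused = (w @ flat).reshape(-1, h, wd)                   # (C_out, H, W)
    return fused

rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 4, 4))   # toy RGB backbone features
tir = rng.standard_normal((8, 4, 4))   # toy TIR backbone features
w = rng.standard_normal((8, 16)) / 4.0 # stand-in for learned 1x1 conv weights
out = fuse_features(rgb, tir, w)
print(out.shape)  # (8, 4, 4)
```

In an actual tracker the projection `w` would be a trained convolution layer, and the same concatenate-and-project pattern could be applied at the classifier or target-estimation stage instead of (or in addition to) the feature extractor.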
Challenge-Aware RGBT Tracking
TLDR
A novel challenge-aware neural network is proposed to handle the modality-shared and modality-specific challenges in RGBT tracking, along with a guidance module that transfers discriminative features from one modality to the other, which can enhance the discriminative ability of the weaker modality.
Channel Exchanging for RGB-T Tracking
  • Long Zhao, Meng Zhu, Honge Ren, Lingjixuan Xue
  • Computer Science, Medicine
  • Sensors
  • 2021
TLDR
DiMP is used as the baseline tracker to design Channel Exchanging DiMP (CEDiMP), an RGB-T object tracking framework that achieves dynamic channel exchanging between sub-networks of different modalities while hardly adding any parameters during the feature fusion process.
MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking
  • Xiao Wang, Xiu Shu, +4 authors Feng Wu
  • Computer Science
  • ArXiv
  • 2021
TLDR
A new dynamic modality-aware filter generation module (named MFGNet) is proposed to boost the message communication between visible and thermal data by adaptively adjusting the convolutional kernels for various input images in practical tracking.
Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking
TLDR
This study develops a novel late fusion method to infer the fusion weight maps of both RGB and thermal (T) modalities and proposes a tracker switcher to switch the appearance and motion trackers flexibly.
Multi-modal Visual Tracking: Review and Experimental Comparison
TLDR
Multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking, are summarized in a unified taxonomy from different aspects.
WF_DiMP: weight-aware dual-modal feature aggregation mechanism for RGB-T tracking
Visual object tracking has attracted a lot of interest due to its applications in numerous fields such as industry and security. Because changes of illumination could lead to RGB tracking…
Object Tracking by Jointly Exploiting Frame and Event Domain
TLDR
This work proposes a multi-modal approach to fuse visual cues from the frame and event domains to enhance single object tracking performance, especially in degraded conditions.
M5L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
TLDR
A novel Multi-Modal Multi-Margin Metric Learning framework, named M5L, for RGBT tracking, which designs a multi-margin structured loss to distinguish the confusing samples that play the most critical role in boosting tracking performance.
LasHeR: A Large-scale High-diversity Benchmark for RGBT Tracking
TLDR
A large-scale High-diversity benchmark for RGBT tracking (LasHeR) is presented, and the unaligned version of LasHeR is released to attract research interest in alignment-free RGBT tracking, which is a more practical task in real-world applications.
The Seventh Visual Object Tracking VOT2019 Challenge Results
The Visual Object Tracking challenge VOT2019 is the seventh annual tracker benchmarking activity organized by the VOT initiative. Results of 81 trackers are presented; many are state-of-the-art…

References

Synthetic Data Generation for End-to-End Thermal Infrared Tracking
TLDR
To the best of our knowledge, this work is the first to train end-to-end features for TIR tracking, and it shows that a network trained on a large dataset of synthetic TIR data obtains better performance than one trained on the available real TIR data.
RGB-T Object Tracking: Benchmark and Baseline
TLDR
A novel graph-based approach to learn a robust object representation for RGB-T tracking is proposed, in which the tracked object is represented with a graph with image patches as nodes, dynamically learned in a unified ADMM (alternating direction method of multipliers)-based optimization framework.
Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking
TLDR
A novel approach to suppress background effects for RGB-T tracking by integrating the soft cross-modality consistency into the ranking model, which allows the sparse inconsistency to account for the different properties between these two modalities.
Learning Discriminative Model Prediction for Tracking
TLDR
An end-to-end tracking architecture, capable of fully exploiting both target and background appearance information for target model prediction, derived from a discriminative learning loss by designing a dedicated optimization process that can predict a powerful model in only a few iterations.
High Performance Visual Tracking with Siamese Region Proposal Network
TLDR
The Siamese region proposal network (Siamese-RPN) is proposed, which is trained end-to-end offline with large-scale image pairs for visual object tracking, and consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork including classification and regression branches.
End-to-End Representation Learning for Correlation Filter Based Tracking
TLDR
This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as a differentiable layer in a deep neural network, which enables learning deep features that are tightly coupled to the correlation filter.
Learning Collaborative Sparse Representation for Grayscale-Thermal Tracking
TLDR
An adaptive fusion scheme is proposed based on collaborative sparse representation in a Bayesian filtering framework, which jointly optimizes sparse codes and the reliability weights of different modalities in an online way to perform robust object tracking in challenging scenarios.
Convolutional Features for Correlation Filter Based Visual Tracking
TLDR
The results suggest that activations from the first layer provide superior tracking performance compared to the deeper layers, and show that the convolutional features provide improved results compared to standard hand-crafted features.
Adaptive Decontamination of the Training Set: A Unified Formulation for Discriminative Visual Tracking
TLDR
This work proposes a novel generic approach for alleviating the problem of corrupted training samples in tracking-by-detection frameworks by minimizing a single loss over both the target appearance model and the sample quality weights.
Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers
TLDR
This paper improves state-of-the-art visual object trackers that use online adaptation by using an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking.