Is First Person Vision Challenging for Object Tracking?

Matteo Dunnhofer, Antonino Furnari, Giovanni Maria Farinella, Christian Micheloni
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Understanding human-object interactions is fundamental in First Person Vision (FPV). Tracking algorithms that follow the objects manipulated by the camera wearer can provide useful cues for modeling such interactions. Visual tracking solutions available in the computer vision literature have significantly improved their performance in recent years for a large variety of target objects and tracking scenarios. However, despite a few previous attempts to exploit trackers in FPV…


Egocentric Prediction of Action Target in 3D
A large multimodal dataset of more than 1 million frames of RGB-D and IMU streams is proposed, and evaluation metrics based on high-quality 2D and 3D labels from semi-automatic annotation are provided, demonstrating that this new task is worthy of further study by researchers in the robotics, vision, and learning communities.
CoCoLoT: Combining Complementary Trackers in Long-Term Visual Tracking
CoCoLoT perceives whether the trackers are following the target object through an online learned deep verification model, and accordingly activates a decision policy which selects the best performing tracker and corrects the performance of the failing one.
Predictive Visual Tracking: A New Benchmark and Baseline Approach
A new predictive visual tracking baseline is developed to compensate for the latency stemming from onboard computation and can provide a more realistic evaluation of trackers for robotic applications.


NUS-PRO: A New Visual Tracking Challenge
A thorough experimental evaluation of 20 state-of-the-art tracking algorithms is presented with detailed analysis using different metrics, and a large-scale database that contains 365 challenging image sequences of pedestrians and rigid objects is proposed.
Object Tracking Benchmark
An extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria is carried out to identify effective approaches for robust tracking and provide potential future research directions in this field.
Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers
This paper improves state-of-the-art visual object trackers that use online adaptation by using an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking.
D3S – A Discriminative Single Shot Segmentation Tracker
Without per-dataset finetuning and trained only for segmentation as the primary output, D3S outperforms all trackers on the VOT2016, VOT2018 and GOT-10k benchmarks and performs close to the state-of-the-art trackers on TrackingNet.
Need for Speed: A Benchmark for Higher Frame Rate Object Tracking
This paper proposes the first higher frame rate video dataset (called Need for Speed - NfS) and benchmark for visual object tracking and finds that at higher frame rates, simple trackers such as correlation filters outperform complex methods based on deep networks.
Know Your Surroundings: Exploiting Scene Information for Object Tracking
This work proposes a novel tracking architecture that can utilize scene information as dense localized state vectors, which can encode, for example, whether the local region is target, background, or distractor, and which are combined with the appearance model output to localize the target.
Online Object Tracking: A Benchmark
Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.
Egocentric Object Tracking: An Odometry-Based Solution
This paper shows how current state-of-the-art visual tracking algorithms fail if challenged with a first-person sequence recorded from a wearable camera attached to a moving user, and proposes a novel approach based on visual odometry and 3D localization that overcomes many issues typical of egocentric vision.
Struck: Structured Output Tracking with Kernels
A framework for adaptive visual object tracking based on structured output prediction that is able to outperform state-of-the-art trackers on various benchmark videos and can easily incorporate additional features and kernels into the framework, which results in increased tracking performance.
The Visual Object Tracking VOT2015 Challenge Results
The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance, and presents a new VOT2015 dataset, twice as large as that of VOT2014, with full annotation of targets by rotated bounding boxes and per-frame attributes.