Siamese Instance Search for Tracking
TLDR
In this paper we present a tracker that is radically different from state-of-the-art trackers: we apply no model updating, no occlusion detection, no combination of trackers, and no geometric matching, and still deliver state-of-the-art tracking performance, as demonstrated on the popular online tracking benchmark (OTB).
  • 579
  • 110
The Sixth Visual Object Tracking VOT2018 Challenge Results
TLDR
The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative.
  • 236
  • 66
Dynamic Image Networks for Action Recognition
TLDR
We introduce the concept of dynamic image, a novel compact representation of videos useful for video analysis, especially when convolutional neural networks (CNNs) are used.
  • 387
  • 65
Modeling video evolution for action recognition
TLDR
In this paper we present a method to capture video-wide temporal information for action recognition.
  • 363
  • 34
VideoLSTM convolves, attends and flows for action recognition
TLDR
We present a new architecture for end-to-end sequence learning of actions in video, which we call VideoLSTM.
  • 250
  • 34
Action Recognition with Dynamic Image Networks
TLDR
We introduce the concept of dynamic image, a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks.
  • 90
  • 24
Rank Pooling for Action Recognition
TLDR
We propose a function-based temporal pooling method that captures the latent structure of the video sequence data, e.g., how frame-level features evolve over time in a video.
  • 207
  • 22
Online Action Detection
TLDR
In online action detection, the goal is to detect the start of an action in a video stream as soon as it happens.
  • 87
  • 21
COSTA: Co-Occurrence Statistics for Zero-Shot Classification
TLDR
In this paper we aim for zero-shot classification, that is, visual recognition of an unseen class by using knowledge transfer from known classes.
  • 199
  • 19
Self-Supervised Video Representation Learning with Odd-One-Out Networks
TLDR
We propose a new self-supervised CNN pre-training technique based on a novel auxiliary task called odd-one-out learning, which generalizes to other related tasks such as action recognition.
  • 193
  • 17