Evaluating Color Descriptors for Object and Scene Recognition
From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition, and that the usefulness of invariance is category-specific.
The challenge problem for automated detection of 101 semantic concepts in multimedia
We introduce the challenge problem for generic video indexing to gain insight into intermediate steps that affect performance of multimedia analysis methods, while at the same time fostering…
Learning Social Tag Relevance by Neighbor Voting
This paper proposes a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors and proves that the algorithm is a good tag relevance measurement for both image ranking and tag ranking.
Early versus late fusion in semantic video analysis
It is shown by experiment on 184 hours of broadcast video data and for 20 semantic concepts that late fusion tends to give slightly better performance for most concepts; however, for those concepts where early fusion performs better, the difference is more significant.
Action Localization with Tubelets from Motion
This paper introduces a sampling strategy to produce 2D+t sequences of bounding boxes, called tubelets, that significantly outperforms the state-of-the-art on both datasets, while restricting the search of actions to a fraction of possible bounding box sequences.
VideoLSTM convolves, attends and flows for action recognition
This work presents a new architecture for end-to-end sequence learning of actions in video, called VideoLSTM, and introduces motion-based attention, which can be used for action localization by relying on just the action class label.
APT: Action localization proposals from dense trajectories
This paper proposes bypassing the segmentation step of existing proposals completely by generating proposals directly from the dense trajectories used to represent videos during classification, using an efficient proposal generation algorithm to handle the high number of trajectories in a video.
Multimodal Video Indexing: A Review of the State-of-the-art
A unifying multimodal framework is put forward that views a video document from the perspective of its author; this perspective forms the guiding principle for identifying index types, for which automatic methods are found in the literature.
Concept-Based Video Retrieval
This paper presents a component-wise decomposition of such an interdisciplinary multimedia system, covering influences from information retrieval, computer vision, machine learning, and human–computer interaction and lays down the anatomy of a concept-based video search engine.
Online Action Detection
A realistic dataset composed of 27 episodes from 6 popular TV series and an evaluation protocol for fair comparison are introduced, showing this is a challenging problem for which none of the methods provides a good solution.