Pascal Mettes

This work aims for image categorization using a representation of distinctive parts. Different from existing part-based work, we argue that parts are naturally shared between image categories and should be modeled as such. We motivate our approach with a quantitative and qualitative analysis by backtracking where selected parts come from. Our analysis shows…
In this paper we summarize our TRECVID 2014 [12] video retrieval experiments. The MediaMill team participated in five tasks: concept detection, object localization, instance search, event recognition and recounting. We experimented with concept detection using deep learning and color difference coding [17], object localization using FLAIR [23], instance…
This paper is concerned with nature conservation by automatically monitoring animal distribution and animal abundance. Typically, such conservation tasks are performed manually on foot or after an aerial recording from a manned aircraft. Such manual approaches are expensive, slow and labor intensive. In this paper, we investigate the combination of small…
The goal of this paper is event detection and recounting using a representation of concept detector scores. Different from existing work, which encodes videos by averaging concept scores over all frames, we propose to encode videos using fragments that are discriminatively learned per event. Our bag-of-fragments splits a video into semantically…
This paper strives for video event detection using a representation learned from deep convolutional neural networks. Different from the leading approaches, which all learn from the 1,000 classes defined in the ImageNet Large Scale Visual Recognition Challenge, we investigate how to leverage the complete ImageNet hierarchy for pre-training deep networks. To…
We strive for spatio-temporal localization of actions in videos. The state-of-the-art relies on action proposals at test time and selects the best one with a classifier that demands carefully annotated boxes at train time. Annotating action boxes in video is cumbersome, tedious, and error prone. Rather than annotating boxes, we propose to annotate…
The automatic recognition of water enables a wide range of applications, yet little attention has been paid to solving this specific problem. Current literature generally treats the problem as part of more general recognition tasks, such as material recognition and dynamic texture recognition, without distinctively analyzing and characterizing the visual…
In this work, the merits of class-dependent image feature selection for real-world material classification are investigated. Current state-of-the-art approaches to material classification attempt to discriminate materials based on their surface properties by using a rich set of heterogeneous local features. The primary foundation of these approaches is the…
This notebook paper describes our approach for the action classification task of the THUMOS 2015 benchmark challenge. We use two types of representations to capture motion and appearance. For a local motion description we employ HOG, HOF and MBH features, computed along the improved dense trajectories. The motion features are encoded into a fixed-length…
In this work, we aim to segment and detect water in videos. Water detection is beneficial for applications such as video search, outdoor surveillance, and systems such as unmanned ground vehicles and unmanned aerial vehicles. The specific problem, however, is less discussed compared to general texture recognition. Here, we analyze several motion properties…