Yannis S. Avrithis

This paper considers a family of metrics to compare images based on their local descriptors. It encompasses the VLAD descriptor and matching techniques such as Hamming Embedding. Making the bridge between these approaches leads us to propose a match kernel that takes the best of existing techniques by combining an aggregation procedure with a selective...
We propose a scalable logo recognition approach that extends the common bag-of-words model and incorporates local geometry in the indexing process. Given a query image and a large logo database, the goal is to recognize the logo contained in the query, if any. We locally group features in triples using multi-scale Delaunay triangulation and represent...
We present a simple vector quantizer that combines low distortion with fast search and apply it to approximate nearest neighbor (ANN) search in high-dimensional spaces. Leveraging the very same data structure that is used to provide non-exhaustive search, i.e., inverted lists or a multi-index, the idea is to locally optimize an individual product quantizer...
Exploiting local feature shape has made geometry indexing possible, but at a high cost in index space, while a sequential spatial verification and re-ranking stage is still indispensable for large-scale image retrieval. In this work we investigate an accelerated approach for the latter problem. We develop a simple spatial matching model inspired by Hough...
Annotations of multimedia documents have typically been pursued in two different directions. Previous approaches have focused either on low-level descriptors, such as dominant color, or on the content dimension and corresponding annotations, such as person or vehicle. In this paper, we present a software environment to bridge between the...
This paper considers a family of metrics to compare images based on their local descriptors. It encompasses the vector of locally aggregated descriptors (VLAD) and matching techniques such as Hamming embedding. Making the bridge between these approaches leads us to propose a match kernel that takes the best of existing techniques by combining an...
State-of-the-art data mining and image retrieval in community photo collections typically focus on popular subsets, e.g. images containing landmarks or associated with Wikipedia articles. We propose an image clustering scheme that, seen as vector quantization, compresses a large corpus of images by grouping visually consistent ones while providing a guaranteed...
In this paper we present the construction of an ontology that represents the structure of the MPEG-7 visual part. The goal of this ontology is to enable machines to generate and understand visual descriptions which can be used for multimedia reasoning. Within the specification, MPEG-7 definitions (description schemes and descriptors) are expressed in XML...
Several spatiotemporal feature point detectors have been used in video analysis for action recognition. Feature points are detected using a number of measures, namely saliency, cornerness, periodicity, motion activity, etc. Each of these measures is usually intensity-based and provides a different trade-off between density and informativeness. In this paper, ...
A framework for video content representation is proposed in this paper for extracting limited, but meaningful, information from video data directly in the MPEG compressed domain. First, the traditional frame-based representation is transformed to a feature-based one. Then, all features are gathered together using a fuzzy formulation and extraction of several...