Ken Chatfield

The latest generation of Convolutional Neural Networks (CNNs) has achieved impressive results on challenging image recognition and object detection benchmarks, significantly raising the community's interest in these methods. Nevertheless, it is still unclear how different CNN methods compare with each other and with previous state-of-the-art shallow…
We present an efficient object retrieval system based on the identification of abstract deformable 'shape' classes using the self-similarity descriptor of Shechtman and Irani [13]. Given a user-specified query object, we retrieve other images which share a common 'shape' even if their appearance differs greatly in terms of…
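The self-similarity descriptor mentioned above is built from the correlation of a small central patch with the patches in its surrounding window. As an illustration only, the following is a minimal sketch of that correlation surface; the full descriptor of Shechtman and Irani additionally bins this surface into a log-polar grid, and all parameter names and values here are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def self_similarity_surface(img, cy, cx, patch=5, window=21):
    """Correlation surface of the patch centred at (cy, cx) against all
    patches in its surrounding window: SSD mapped through a negative
    exponential, normalised by the central patch's variance (sketch)."""
    pr, wr = patch // 2, window // 2
    center = img[cy - pr:cy + pr + 1, cx - pr:cx + pr + 1].astype(float)
    # A small floor on the variance avoids division by zero on flat patches.
    var = max(center.var(), 1e-6)
    surf = np.zeros((window, window))
    for dy in range(-wr, wr + 1):
        for dx in range(-wr, wr + 1):
            y, x = cy + dy, cx + dx
            p = img[y - pr:y + pr + 1, x - pr:x + pr + 1].astype(float)
            ssd = ((p - center) ** 2).sum()
            # Identical patches give ssd = 0 and correlation 1.0.
            surf[dy + wr, dx + wr] = np.exp(-ssd / (patch * patch * var))
    return surf
```

Because the surface depends only on internal layout rather than raw appearance, two objects with the same 'shape' but different colour or texture can still produce similar descriptors.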
The objective of this work is to visually search large-scale video datasets for semantic entities specified by a text query. The paradigm we explore is constructing visual models for such semantic entities on-the-fly, i.e. at run time, by using an image search engine to source visual training data for the text query. The approach combines fast and accurate…
We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval – where classifiers are learnt at run time for a textual query from downloaded images, and used to rank large image or video datasets. We make three contributions: (i) we present an evaluation of state-of-the-art image…
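The on-the-fly paradigm in the two entries above can be sketched as: treat features of images returned for the text query as positives, a fixed generic pool as negatives, train a linear classifier at run time, and rank the target dataset by its scores. In this illustrative sketch, synthetic feature vectors stand in for ConvNet features and a tiny logistic-regression loop stands in for the linear SVM these systems typically use; function and parameter names are assumptions for illustration.

```python
import numpy as np

def on_the_fly_rank(query_pos, neg_pool, database, lr=0.1, epochs=50):
    """Train a linear classifier at run time (positives vs. a fixed
    negative pool) and rank database rows by decision score (sketch)."""
    X = np.vstack([query_pos, neg_pool])
    y = np.hstack([np.ones(len(query_pos)), -np.ones(len(neg_pool))])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Gradient of the mean logistic loss log(1 + exp(-y * x.w)).
        margins = y * (X @ w)
        grad = -(y / (1 + np.exp(margins))) @ X / len(y)
        w -= lr * grad
    scores = database @ w
    return np.argsort(-scores)  # database indices, best match first
```

In a real system the ranking step is the bottleneck, so the database features are precomputed once and only the cheap dot products run per query – which is why learning the classifier at query time remains interactive.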
The EU FP7 project AXES aims at better understanding the needs of archive users and supporting them with systems that reach beyond the state-of-the-art. Our system allows users to instantaneously retrieve content using metadata, spoken words, or a vocabulary of reliably detected visual concepts comprising places, objects and events. Additionally, users can…