• Publications
Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
TLDR
We describe a model of object recognition as machine translation.
  • 1,744
  • 195
Matching Words and Pictures
TLDR
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text.
  • 1,732
  • 91
Automatic multimedia cross-modal correlation discovery
TLDR
We propose a novel, graph-based approach, "MMG", to discover cross-modal correlations in multimedia objects, which can be applied to any multimedia collection, as long as we have a similarity function for each medium.
  • 461
  • 48
Automatic image captioning
We examine the problem of automatic image captioning. Given a training set of captioned images, we want to discover correlations between image features and keywords, so that we can automatically find…
  • 111
  • 11
Recognizing actions from still images
TLDR
In this paper, we approach the problem of understanding human actions from still images.
  • 98
  • 11
A Graph Based Approach for Naming Faces in News Photos
  • D. Ozkan, P. D. Sahin
  • Computer Science
  • IEEE Computer Society Conference on Computer…
  • 17 June 2006
TLDR
We propose a graph based method to associate names and faces for querying people in large news photo collections using both text and visual appearances.
  • 89
  • 11
Histogram of oriented rectangles: A new pose descriptor for human action recognition
TLDR
We propose a novel pose descriptor which we name as Histogram-of-Oriented Rectangles (HOR) for representing and recognizing human actions in videos.
  • 131
  • 9
Human Action Recognition Using Distribution of Oriented Rectangular Patches
TLDR
We describe a "bag-of-rectangles" method for representing and recognizing human actions in videos.
  • 112
  • 8
GCap: Graph-based Automatic Image Captioning
TLDR
We propose a novel, graph-based approach (GCap) which, when applied for the task of image captioning, outperforms previously reported methods.
  • 118
  • 5
Clustering art
TLDR
We extend a recently developed method (K. Barnard and D. Forsyth, 2001) for learning the semantics of image databases using text and pictures using a probabilistic model due to Hofmann.
  • 188
  • 4