Every Picture Tells a Story: Generating Sentences from Images


Humans can prepare concise descriptions of pictures, focusing on what they find important. We demonstrate that automatic methods can do so too. We describe a system that can compute a score linking an image to a sentence. This score can be used to attach a descriptive sentence to a given image, or to obtain images that illustrate a given sentence. The score is obtained by comparing an estimate of meaning obtained from the image to one obtained from the sentence. Each estimate of meaning comes from a discriminative procedure that is learned using data. We evaluate on a novel dataset consisting of human-annotated images. While our underlying estimate of meaning is impoverished, it is sufficient to produce very good quantitative results, evaluated with a novel score that can account for synecdoche.
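The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each meaning estimate is a small numeric vector and that the score is a cosine similarity. The abstract specifies neither, and the names `cosine_score` and `rank_by_score`, along with the toy vectors, are hypothetical rather than taken from the paper. The same ranking routine covers both uses mentioned: attaching a sentence to an image and retrieving images for a sentence.

```python
import math


def cosine_score(u, v):
    """Score two meaning estimates (assumed here to be equal-length vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty estimate matches nothing
    return dot / (norm_u * norm_v)


def rank_by_score(query_meaning, candidates):
    """Return (score, label) pairs sorted from best to worst match.

    `candidates` maps a label (a sentence or an image id) to its
    meaning estimate. The query can come from either modality.
    """
    scored = [(cosine_score(query_meaning, m), label)
              for label, m in candidates.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical meaning estimates; in the paper's setting they would be
# produced by learned discriminative procedures, not written by hand.
image_meaning = [0.9, 0.1, 0.0]
sentence_meanings = {
    "A dog runs on grass.": [1.0, 0.0, 0.0],
    "A plane is parked at the airport.": [0.0, 0.0, 1.0],
}

best_score, best_sentence = rank_by_score(image_meaning, sentence_meanings)[0]
print(best_sentence)
```

Swapping the roles (a sentence estimate as the query, image estimates as the candidates) gives the illustration direction with no change to the ranking code.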

DOI: 10.1007/978-3-642-15561-1_2


Semantic Scholar estimates that this publication has 474 citations based on the available data.


Cite this paper

@inproceedings{Farhadi2010EveryPT,
  title={Every Picture Tells a Story: Generating Sentences from Images},
  author={Ali Farhadi and Mohsen Hejrati and Mohammad Amin Sadeghi and Peter Young and Cyrus Rashtchian and Julia Hockenmaier and David A. Forsyth},
  booktitle={ECCV},
  year={2010}
}