Pragmatic descriptions of perceptual stimuli
Emiel van Miltenburg

This research proposal discusses pragmatic factors in image description, arguing that current automatic image description systems do not take these factors into account. I present a general model of the human image description process, and propose to study this process using corpus analysis, experiments, and computational modeling. This will lead to a better characterization of human image description behavior, providing a road map for future research in automatic image description, and the… 
1 Citation

The Task Matters: Comparing Image Captioning and Task-Based Dialogical Image Description
It is demonstrated that careful design of data collection is required to obtain image descriptions which are contextually bounded to a particular meta-level task.

Pragmatic Factors in Image Description: The Case of Negations
This paper provides a qualitative analysis of the descriptions containing negations in the Flickr30K corpus, a categorization of negation uses, and a set of requirements that an image description system should meet in order to generate negation sentences.
From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions
This work proposes to use the visual denotations of linguistic expressions to define novel denotational similarity metrics, which are shown to be at least as beneficial as distributional similarities for two tasks that require semantic inference.
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
This survey classifies the existing approaches based on how they conceptualize the problem: as a generation problem, or as a retrieval problem over a visual or multimodal representational space.
Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics (Extended Abstract)
This work proposes to frame sentence-based image annotation as the task of ranking a given pool of captions, and introduces a new benchmark collection, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events.
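The ranking formulation in this entry can be illustrated with a minimal sketch. The cosine-similarity scoring and the embeddings below are toy stand-ins, not the paper's actual model; a real system would learn a visual or multimodal representation space in which images and captions are compared.

```python
import numpy as np

def rank_captions(image_vec, caption_vecs):
    """Rank a pool of candidate captions for an image, best first.

    image_vec:    embedding of the image (placeholder representation).
    caption_vecs: list of caption embeddings in the same space.
    Returns the caption indices ordered by cosine similarity to the image.
    """
    img = image_vec / np.linalg.norm(image_vec)
    scores = []
    for i, cap in enumerate(caption_vecs):
        sim = float(img @ (cap / np.linalg.norm(cap)))
        scores.append((sim, i))
    # Highest similarity first.
    return [i for _, i in sorted(scores, reverse=True)]
```

With a toy 2-d space, an image embedded at `[1, 0]` ranks a caption at `[1, 0]` above one at `[1, 0.1]`, and both above an orthogonal caption at `[0, 1]`, which is the intended behavior of the ranking task.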
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures (Extended Abstract)
An overview of the benchmark image-text datasets and the evaluation measures that have been developed to assess the quality of machine-generated descriptions and future directions in the area of automatic image description are explored.
Re-evaluating Automatic Metrics for Image Captioning
This paper provides an in-depth evaluation of the existing image captioning metrics through a series of carefully designed experiments, and explores the use of the recently proposed Word Mover's Distance document metric for image captioning.
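Word Mover's Distance, mentioned in this entry, can be sketched as a small transportation problem: move the normalized bag-of-words mass of one caption onto another at minimum total cost, where cost is the distance between word embeddings. The sketch below uses `scipy.optimize.linprog` and a toy embedding dictionary; it is an illustration of the general WMD idea, not the evaluation setup of the paper.

```python
from collections import Counter
import numpy as np
from scipy.optimize import linprog

def wmd(doc1, doc2, embeddings):
    """Word Mover's Distance between two tokenized captions.

    embeddings: dict mapping word -> np.ndarray vector (toy vectors here;
    a real system would use pretrained embeddings such as word2vec).
    """
    w1, w2 = sorted(set(doc1)), sorted(set(doc2))
    c1, c2 = Counter(doc1), Counter(doc2)
    d1 = np.array([c1[w] for w in w1], float); d1 /= d1.sum()
    d2 = np.array([c2[w] for w in w2], float); d2 /= d2.sum()
    # Pairwise Euclidean distances between word vectors.
    C = np.array([[np.linalg.norm(embeddings[a] - embeddings[b]) for b in w2]
                  for a in w1])
    n, m = C.shape
    # Flow variables T[i, j] >= 0, flattened row-major.
    A_eq, b_eq = [], []
    for i in range(n):  # each source word ships all of its mass
        row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row); b_eq.append(d1[i])
    for j in range(m):  # each target word receives exactly its mass
        col = np.zeros(n * m); col[j::m] = 1.0
        A_eq.append(col); b_eq.append(d2[j])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
    return res.fun
```

Identical captions get distance 0, and captions whose words lie close together in the embedding space score lower than unrelated ones, which is what makes WMD attractive as a soft caption-similarity measure.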
Mechanisms of linguistic bias: How words reflect and maintain stereotypic expectancies
…descriptions appropriately describe the given behavior. However, because the different LCM categories elicit different cognitive inferences, the implicit meaning that is communicated varies as a…
The negation bias: when negations signal stereotypic expectancies.
Findings indicate that by using negations people implicitly communicate stereotypic expectancies and that negations play a subtle but powerful role in stereotype maintenance.
Show and tell: A neural image caption generator
This paper presents a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.
Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels
This paper proposes an algorithm to decouple the human reporting bias from the correct visually grounded labels, and shows significant improvements over traditional algorithms for both image classification and image captioning, doubling the performance of existing methods in some cases.