Corpus ID: 238743817

Understanding of Emotion Perception from Art

Digbalay Bose, Krishna Somandepalli, Souvik Kundu, Rimita Lahiri, J. Gratch, Shrikanth S. Narayanan
Computational modeling of the emotions that art evokes in humans is a challenging problem because of the subjective and nuanced nature of both art and affective signals. In this paper, we consider this problem of understanding the emotions artwork evokes in viewers using both text and visual modalities. Specifically, we frame the analysis of images and the accompanying captions in which viewers express their emotions as a multimodal classification task. Our results show that single-stream multimodal… 
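The single-stream setup mentioned in the abstract can be illustrated schematically: image-region and caption-token embeddings are merged into one sequence and scored jointly by a single classifier. A minimal numpy sketch, where all dimensions, the random features, and the linear classifier are illustrative assumptions rather than the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 16, 8  # shared embedding dim, number of emotion classes (illustrative)

def single_stream_classify(image_feats, text_feats, W):
    """Concatenate image-region and caption-token embeddings into one
    sequence, mean-pool it, and apply a linear emotion classifier."""
    seq = np.concatenate([image_feats, text_feats], axis=0)  # (R+T, D)
    pooled = seq.mean(axis=0)                                # (D,)
    logits = pooled @ W                                      # (K,)
    return int(logits.argmax())

image_feats = rng.normal(size=(4, D))  # 4 image-region features (toy data)
text_feats = rng.normal(size=(6, D))   # 6 caption-token embeddings (toy data)
W = rng.normal(size=(D, K))            # untrained classifier weights
pred = single_stream_classify(image_feats, text_feats, W)
```

A trained model would learn the embeddings and `W` jointly; the point here is only that one stream processes both modalities together, in contrast to two-stream designs that fuse modality-specific encoders later.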

References


ArtEmis: Affective Language for Visual Art
We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and…
Emotional valence categorization using holistic image features
An emotion categorization system is proposed, trained on ground truth from psychology studies and based on local image statistics that are learned per emotional category using support vector machines.
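The per-category scheme described above amounts to one-vs-rest classification: one scorer is trained per emotional category and the highest-scoring category wins. A self-contained numpy sketch with hinge-loss updates standing in for a full SVM solver (the toy features and category count are illustrative, not from the paper):

```python
import numpy as np

def train_one_vs_rest(X, y, n_classes, lr=0.1, epochs=200):
    """Train one linear hinge-loss scorer per category (one-vs-rest),
    a simplified stand-in for the per-category SVMs described above."""
    W = np.zeros((n_classes, X.shape[1]))
    b = np.zeros(n_classes)
    for c in range(n_classes):
        t = np.where(y == c, 1.0, -1.0)  # +1 for this category, -1 otherwise
        for _ in range(epochs):
            margins = t * (X @ W[c] + b[c])
            mask = margins < 1                        # hinge-loss violations
            W[c] += lr * (t[mask] @ X[mask]) / len(X)
            b[c] += lr * t[mask].sum() / len(X)
    return W, b

def predict(X, W, b):
    """Assign each sample to the category with the highest score."""
    return (X @ W.T + b).argmax(axis=1)

# Toy "image statistics": three well-separated clusters, one per category.
rng = np.random.default_rng(1)
means = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(20, 2)) for m in means])
y = np.repeat(np.arange(3), 20)
W, b = train_one_vs_rest(X, y, n_classes=3)
acc = (predict(X, W, b) == y).mean()
```

A real SVM adds margin regularization and kernels; the one-vs-rest decision rule, however, is the same.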
WikiArt Emotions: An Annotated Dataset of Emotions Evoked by Art
It is found that pieces of art depicting faces draw more consistent emotional responses than those that do not, and the dataset supports developing systems that can transform existing art or generate new art to evoke a desired affective response.
The Cognitive Structure of Emotions
Topics include: the boundaries of the theory; emotion words and cross-cultural issues; emotion experiences and unconscious emotions; coping and the function of emotions; and computational tractability.
Can We Understand van Gogh's Mood? Learning to Infer Affects from Images in Social Networks
A semi-supervised framework is proposed that formulates the problem of inferring affects from images in social networks as a factor graph model; the method's effectiveness is demonstrated by automatically inferring van Gogh's mood from his artworks and by tracking trends in public affect around special events.
VisualBERT: A Simple and Performant Baseline for Vision and Language
Analysis demonstrates that VisualBERT can ground elements of language to image regions without any explicit supervision and is even sensitive to syntactic relationships, tracking, for example, associations between verbs and image regions corresponding to their arguments.
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
This work proposes a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable, and shows that even non-attention-based models learn to localize discriminative regions of the input image.
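The core Grad-CAM computation is compact: the channel weights are the spatially averaged gradients of the class score with respect to a conv layer's activations, and the heatmap is a ReLU of the weighted sum of activation channels. A numpy sketch over pre-computed activations and gradients (the shapes and random inputs are illustrative; a real pipeline obtains both tensors via backpropagation through a CNN):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from a conv layer's activations A_k of shape
    (C, H, W) and the gradients of the class score w.r.t. those
    activations: alpha_k = spatially averaged gradients per channel,
    map = ReLU(sum_k alpha_k * A_k), normalized to [0, 1]."""
    alphas = gradients.mean(axis=(1, 2))             # (C,) channel weights
    cam = np.tensordot(alphas, activations, axes=1)  # (H, W) weighted sum
    cam = np.maximum(cam, 0)                         # ReLU keeps positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                        # normalize for display
    return cam

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 7, 7))  # 8 channels, 7x7 spatial grid (toy data)
G = rng.normal(size=(8, 7, 7))  # gradients of the class score w.r.t. A
heatmap = grad_cam(A, G)
```

The heatmap is then upsampled to the input resolution and overlaid on the image to visualize which regions supported the prediction.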
VinVL: Revisiting Visual Representations in Vision-Language Models
This paper develops an improved object detection model to provide object-centric representations of images, feeds the resulting visual features into the Transformer-based VL fusion model OSCAR, and uses an improved approach, OSCAR+, to pre-train the VL model and fine-tune it on a wide range of downstream VL tasks.
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
A combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions is proposed, demonstrating the broad applicability of this approach to VQA.
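The mechanism summarized above can be sketched in a few lines: bottom-up processing yields a set of pre-detected region features, and a top-down signal (e.g. a question embedding) scores and reweights them. A numpy sketch with a bilinear scoring function; the dimensions, random features, and scoring form are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def top_down_attend(region_feats, query, W):
    """Top-down attention over bottom-up region features: score each
    detected region against the task query, softmax the scores, and
    return the attention-weighted average feature."""
    scores = region_feats @ W @ query        # (R,) relevance per region
    weights = softmax(scores)                # (R,) sums to 1
    return weights @ region_feats, weights   # (D,) attended feature

rng = np.random.default_rng(0)
R, D, Q = 5, 16, 8                 # regions, feature dim, query dim (toy sizes)
regions = rng.normal(size=(R, D))  # bottom-up: pre-detected region features
query = rng.normal(size=Q)         # top-down: e.g. a question embedding
W = rng.normal(size=(D, Q))        # bilinear scoring weights (untrained)
attended, weights = top_down_attend(regions, query, W)
```

The contrast with grid-based attention is that the attention candidates are object proposals rather than uniform CNN grid cells, which aligns attention with salient image content.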
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language. We extend the popular BERT architecture to a…