Hamed R.-Tavakoli

Understanding the emotional content of a scene benefits multimedia analysis tasks such as video genre classification and content-based image retrieval. Recently, there has been increasing interest in applying human bio-signals, particularly eye movements, to recognize the emotional gist of a scene, such as its valence. In order to …
This paper presents a framework for image captioning that exploits the scene context. To date, most captioning models have relied on the combination of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) model, trained in an end-to-end fashion. Recently, there has been extensive research towards improving the language …
There are errors in the Funding section. The correct funding information is as follows: This work is supported by Infotech Oulu, Academy of Finland Grant No. 259431, Nokia Scholarships, and the Finnish Centre of Excellence in Computational Inference Research (COIN). The funders had no role in study design, data collection and analysis, decision to publish, …