Polina Kuznetsova

We present a holistic data-driven approach to image description generation, exploiting the vast amount of (noisy) parallel image data and associated natural language descriptions available on the web. More specifically, given a query image, we retrieve existing human-composed phrases used to describe visually similar images, then selectively combine those …
We present a new tree-based approach to composing expressive image descriptions that makes use of naturally occurring web images with captions. We investigate two related tasks: image caption generalization and generation, where the former is an optional subtask of the latter. The high-level idea of our approach is to harvest expressive phrases (as tree …
Understanding the connotation of words plays an important role in interpreting subtle shades of sentiment beyond the denotative or surface meaning of text, as seemingly objective statements often allude to nuanced sentiment of the writer, and even purposefully conjure emotion from the readers’ minds. The focus of this paper is drawing nuanced, connotative …
Preexisting or new-onset atrial fibrillation (AF) commonly occurs in patients with an acute coronary syndrome (ACS). However, it is currently unknown if previous or new-onset AF confers different risks in these patients. To determine the prognostic significance of new-onset and previous AF in patients with ACS, we evaluated all patients with ACS enrolled in …
This paper offers an approach for governments to harness the information contained in social media in order to make public inspections and disclosure more efficient. As a case study, we turn to restaurant hygiene inspections – which are done for restaurants throughout the United States and in most of the world and are a frequently cited example of public …
What is the story of an image? What is the relationship between pictures, language, and information we can extract using state-of-the-art computational recognition systems? In an attempt to address both of these questions, we explore methods for retrieving and generating natural language descriptions for images. Ideally, we would like our generated textual …
The ever-growing amount of web images and their associated texts offers new opportunities for integrative models bridging natural language processing and computer vision. However, the potential benefits of such data are yet to be fully realized due to the complexity and noise in the alignment between image content and text. We address this challenge with …
We present a new approach to harvesting a large-scale, high-quality image-caption corpus that makes better use of already existing web data with no additional human effort. The key idea is to focus on Déjà Image-Captions: naturally existing image descriptions that are repeated almost verbatim by more than one individual for different images. The …
We study Refer-to-as relations as a new type of semantic knowledge. Compared to the much-studied Is-a relation, which concerns factual taxonomic knowledge, Refer-to-as relations aim to address pragmatic semantic knowledge. For example, a “penguin” is a “bird” from a taxonomic point of view, but people rarely refer to a “penguin” as a “bird” in vernacular …
Why do certain combinations of words such as “disadvantageous peace” or “metal to the petal” appeal to our minds as interesting expressions with a sense of creativity, while other phrases such as “quiet teenager” or “geometrical base” do not as much? We present statistical explorations to understand the characteristics of lexical compositions that give rise …