Thi Quynh Nhi Tran

Cross-modal tasks occur naturally for multimedia content that can be described along two or more modalities, such as visual content and text. Such tasks require "translating" information from one modality to another. Methods like kernelized canonical correlation analysis (KCCA) attempt to solve them by finding aligned subspaces in the description spaces …
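A minimal sketch of the aligned-subspace idea, not the authors' implementation: linear CCA from scikit-learn stands in for the kernelized variant (KCCA), and the feature matrices and dimensions below are purely illustrative.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X_img = rng.normal(size=(500, 128))   # hypothetical visual descriptors
X_txt = rng.normal(size=(500, 300))   # hypothetical textual descriptors (paired)

# Fit a shared subspace in which paired image/text descriptions are maximally correlated.
cca = CCA(n_components=32)
cca.fit(X_img, X_txt)

# Project both modalities into the aligned subspace; cross-modal "translation"
# then amounts to nearest-neighbour search in this common space.
Z_img, Z_txt = cca.transform(X_img, X_txt)
```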
This paper describes our participation in the ImageCLEF 2016 scalable concept image annotation main task and the Text Illustration teaser. Regarding image annotation, we focused on better localizing the detected features. For this, we computed the saliency of the image to collect a list of potentially interesting regions within it. We also added a specific …
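A rough sketch of a saliency-based region-collection step, assuming opencv-contrib-python is available; spectral-residual saliency is used here only as an illustrative stand-in, not as the pipeline described in the paper.

```python
import cv2
import numpy as np

image = cv2.imread("example.jpg")  # hypothetical input image path

# Compute a static saliency map (requires the opencv-contrib saliency module).
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = saliency.computeSaliency(image)

# Threshold the map and keep connected components as candidate
# "interesting places" on which feature extraction could be focused.
mask = (saliency_map * 255).astype(np.uint8)
_, mask = cv2.threshold(mask, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
num, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
candidate_boxes = [tuple(stats[i, :4]) for i in range(1, num)]  # (x, y, w, h)
```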
We argue that cross-modal classification, where models are trained on data from one modality (e.g., text) and applied to data from another (e.g., images), is a relevant problem in multimedia retrieval. We propose a method that addresses this specific problem, which is related to but different from cross-modal retrieval and bimodal classification. This method relies on …
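A minimal sketch of the cross-modal classification setting itself (an assumed setup, not the paper's method): a classifier is trained on text features projected into a shared CCA space and then applied to image features projected into the same space.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
txt = rng.normal(size=(400, 300))      # hypothetical text descriptors
img = rng.normal(size=(400, 128))      # paired image descriptors
labels = rng.integers(0, 5, size=400)  # hypothetical class labels

# Learn a shared space from paired text/image data.
cca = CCA(n_components=16).fit(txt, img)
txt_z, img_z = cca.transform(txt, img)

# Train on the text modality, apply to the image modality.
clf = LogisticRegression(max_iter=1000).fit(txt_z, labels)
img_predictions = clf.predict(img_z)
```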
This is the supplementary material to the article "Aggregating Image and Text Quantized Correlated Components", published at CVPR 2016. While not necessary to understand the work presented in the paper, it reports some complementary results and "interesting negative results". The last section provides implementation details for reproducing the experiments …
Cross-modal retrieval increasingly relies on joint statistical models built from large amounts of data represented according to several modalities. However, some information that is poorly represented by these models can be highly significant for a retrieval task. We show that, by appropriately identifying and taking such information into account, the results …