Cross-Media Retrieval via Semantic Entity Projection

Lei Huang and Yuxin Peng
Cross-media retrieval is becoming increasingly important. To address this challenging problem, most existing approaches project heterogeneous features into a unified feature space to facilitate similarity computation. However, this unified feature space usually has no explicit semantic meaning and may ignore the hints contained in the original media content, so it cannot fully measure the similarities among different media types. By considering the above issues…

Cross-media retrieval by exploiting fine-grained correlation at entity level

Latent semantic factorization for multimedia representation learning

A novel multimedia representation learning framework via latent semantic factorization (LSF), where the posterior probability under the learned classifiers serves as the latent semantic representation for different modalities.

Joint graph regularization based semantic analysis for cross-media retrieval: a systematic review

The aim is to analyze different cross-media retrieval approaches based on joint graph regularization (JGR) in order to understand the various techniques.

Cross Media Feature Retrieval and Optimization: A Contemporary Review of Research Scope, Challenges and Objectives

This manuscript briefly reviews recent advances and the future research scope in cross-media feature retrieval and optimization.



Harmonizing Hierarchical Manifolds for Multimedia Document Semantics Understanding and Cross-Media Retrieval

A Laplacian media object space is constructed to represent the media objects of each modality, a multimedia document (MMD) semantic graph is constructed to perform cross-media retrieval, and different methods are proposed to utilize relevance feedback.

Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-Media Retrieval

This paper proposes a transductive learning method to mine the semantic correlations among media objects of different modalities in order to achieve cross-media retrieval.

On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval

A mathematical formulation equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities is proposed, finding that both hypotheses hold, in a complementary form, although evidence in favor of the abstraction hypothesis is stronger than that for correlation.

A new approach to cross-modal multimedia retrieval

It is shown that accounting for cross-modal correlations and for semantic abstraction both improve retrieval accuracy. The cross-modal model is also shown to outperform state-of-the-art image retrieval systems on a unimodal retrieval task.

Multimedia content processing through cross-modal association

This paper investigates different cross-modal association methods using the linear correlation model, and introduces a novel method for cross-modal association called Cross-modAL Factor Analysis (CFA), which shows several advantages in analysis performance and feature usage.

Towards semantic knowledge propagation from text corpus to web images

A mathematical model for the functional relationships between text and image features is developed so as to indirectly transfer semantic knowledge through feature transformations, which is accomplished by mapping instances from different domains into a common space of unspecific topics.

Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval

Coupled dictionary learning is introduced into supervised sparse coding for multi-modal (cross-media) retrieval, yielding a model called Supervised coupled dictionary learning with group structures for Multi-Modal retrieval (SliM2). The experimental results show the effectiveness of the proposed model when applied to cross-media retrieval.

Cross-modal Retrieval with Correspondence Autoencoder

The problem of cross-modal retrieval, e.g., using a text query to search for images and vice-versa, is considered in this paper. A novel model involving correspondence autoencoder (Corr-AE) is

Towards optimal bag-of-features for object categorization and semantic video retrieval

This paper evaluates various factors that govern the performance of bag-of-features, proposes a novel soft-weighting method to assess the significance of a visual word to an image, and experimentally shows that it consistently offers better performance than other popular weighting methods.

ImageNet: A large-scale hierarchical image database

A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.