• Corpus ID: 240354191

Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation

Jinghao Zhang, Yanqiao Zhu, Qiang Liu, Mengqi Zhang, Shu Wu, Liang Wang
Multimedia content predominates in the modern Web era. Recent years have witnessed growing research interest in multimedia recommendation, which aims to predict whether a user will interact with an item that has multimodal content. Most previous studies focus on modeling user-item interactions, with multimodal features included as side information. However, this scheme is not well designed for multimedia recommendation. First, only collaborative item-item relationships are implicitly…

Multi-Modal Contrastive Pre-training for Recommendation
A self-supervised contrastive inter-modal alignment task that makes the textual and visual modalities as similar as possible in order to exploit the potential correlation between users and items.


Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback
A new GCN-based recommender model, Graph-Refined Convolutional Network (GRCN), is devised, which adaptively adjusts the structure of the interaction graph based on the status of model training instead of keeping a fixed structure.
VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback
This paper proposes a scalable factorization model to incorporate visual signals into predictors of people's opinions, which is applied to a selection of large, real-world datasets and makes use of visual features extracted from product images using (pre-trained) deep networks.
Hierarchical User Intent Graph Network for Multimedia Recommendation
A novel framework, Hierarchical User Intent Graph Network, is developed, which exhibits user intents in a hierarchical graph structure, from fine-grained to coarse-grained intents, and achieves significant improvements over the state-of-the-art methods, including MMGCN and DisenGCN.
Self-supervised Graph Learning for Recommendation
This work explores self-supervised learning on the user-item graph, so as to improve the accuracy and robustness of GCNs for recommendation, and implements it on the state-of-the-art model LightGCN, which has the ability to automatically mine hard negatives.
Iterative Deep Graph Learning for Graph Neural Networks: Better and Robust Node Embeddings
This paper proposes an end-to-end graph learning framework, Iterative Deep Graph Learning (IDGL), for jointly and iteratively learning graph structure and graph embedding, together with a scalable version, IDGL-ANCH, which significantly reduces the time and space complexity of IDGL without compromising performance.
MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video
A Multi-modal Graph Convolution Network (MMGCN) framework is designed, built upon the message-passing idea of graph neural networks, which can yield modal-specific representations of users and micro-videos to better capture user preferences.
Neural Graph Collaborative Filtering
This work develops a new recommendation framework, Neural Graph Collaborative Filtering (NGCF), which exploits the user-item graph structure by propagating embeddings on it, explicitly injecting the collaborative signal into the embedding process.
DeepStyle: Learning User Preferences for Visual Recommendation
A DeepStyle method is proposed for learning style features of items and sensing preferences of users, and its effectiveness for visual recommendation is illustrated.
Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention
A novel attention mechanism in CF is introduced to address the challenging item- and component-level implicit feedback in multimedia recommendation, dubbed Attentive Collaborative Filtering (ACF), which significantly outperforms state-of-the-art CF methods.
Iterative Deep Graph Learning for Graph Neural Networks: Better and Robust Node Embeddings
  • NeurIPS, 2020, pp. 19314–19326.