• Corpus ID: 240354191

Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation

@article{Zhang2021LatentSM,
  title={Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation},
  author={Jinghao Zhang and Yanqiao Zhu and Qiang Liu and Mengqi Zhang and Shu Wu and Liang Wang},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.00678}
}
Multimedia content predominates on the modern Web. Recent years have witnessed growing research interest in multimedia recommendation, which aims to predict whether a user will interact with an item that has multimodal content. Most previous studies focus on modeling user-item interactions, with multimodal features included as side information. However, this scheme is not well suited to multimedia recommendation. Firstly, only collaborative item-item relationships are implicitly…

Multi-Modal Contrastive Pre-training for Recommendation
TLDR
A self-supervised contrastive inter-modal alignment task that makes the textual and visual modalities as similar as possible, in order to exploit the potential correlation between users and items.

References

Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback
TLDR
A new GCN-based recommender model, Graph-Refined Convolutional Network (GRCN), is devised; it adaptively adjusts the structure of the interaction graph based on the status of model training instead of keeping a fixed structure.
VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback
TLDR
This paper proposes a scalable factorization model to incorporate visual signals into predictors of people's opinions, which is applied to a selection of large, real-world datasets and makes use of visual features extracted from product images using (pre-trained) deep networks.
Hierarchical User Intent Graph Network for Multimedia Recommendation
TLDR
A novel framework, Hierarchical User Intent Graph Network, is developed, which exhibits user intents in a hierarchical graph structure, from the fine-grained to coarse-grained intents, and achieves significant improvements over the state-of-the-art methods, including MMGCN and DisenGCN.
Self-supervised Graph Learning for Recommendation
TLDR
This work explores self-supervised learning on the user-item graph to improve the accuracy and robustness of GCNs for recommendation, and implements it on the state-of-the-art model LightGCN, which has the ability of automatically mining hard negatives.
Iterative Deep Graph Learning for Graph Neural Networks: Better and Robust Node Embeddings
TLDR
This paper proposes an end-to-end graph learning framework, namely Iterative Deep Graph Learning (IDGL), for jointly and iteratively learning graph structure and graph embedding and proposes a scalable version of IDGL, namely IDGL-ANCH, which significantly reduces the time and space complexity of IDGL without compromising the performance.
MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video
TLDR
A Multi-modal Graph Convolution Network (MMGCN) framework built upon the message-passing idea of graph neural networks, which can yield modal-specific representations of users and micro-videos to better capture user preferences is designed.
Neural Graph Collaborative Filtering
TLDR
This work develops a new recommendation framework Neural Graph Collaborative Filtering (NGCF), which exploits the user-item graph structure by propagating embeddings on it, effectively injecting the collaborative signal into the embedding process in an explicit manner.
DeepStyle: Learning User Preferences for Visual Recommendation
TLDR
A DeepStyle method is proposed for learning style features of items and sensing preferences of users and the effectiveness of DeepStyle for visual recommendation is illustrated.
Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention
TLDR
A novel attention mechanism in CF is introduced to address the challenging item- and component-level implicit feedback in multimedia recommendation, dubbed Attentive Collaborative Filtering (ACF), which significantly outperforms state-of-the-art CF methods.
Iterative Deep Graph Learning for Graph Neural Networks: Better and Robust Node Embeddings
  • NeurIPS, 2020, pp. 19314–19326.
  • 2020