Corpus ID: 235358354

Scientific Dataset Discovery via Topic-level Recommendation

@article{Altaf2021ScientificDD,
  title={Scientific Dataset Discovery via Topic-level Recommendation},
  author={Basmah Altaf and Shichao Pei and Xiangliang Zhang},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.03399}
}
Data-intensive research requires the support of appropriate datasets. However, it is often time-consuming to discover usable datasets matching a specific research topic. We formulate the dataset discovery problem on an attributed heterogeneous graph, which is composed of paper-paper citations, paper-dataset citations, and paper content. We propose to characterize both paper and dataset nodes by their commonly shared latent topics, rather than learning user and item representations via…
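
To make the graph described in the abstract concrete, the following is a minimal sketch of an attributed heterogeneous graph with paper and dataset nodes, paper-paper and paper-dataset citation edges, and paper content stored as a node attribute. It is an illustration only, not the authors' implementation: the class `HeteroGraph`, its method names, and the identifiers `p1`, `p2`, and `d1` are all hypothetical, and the topic-level recommendation model itself is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class HeteroGraph:
    """Toy attributed heterogeneous graph (hypothetical, for illustration)."""

    # Paper nodes, each carrying its content as a text attribute.
    paper_text: Dict[str, str] = field(default_factory=dict)
    # Dataset nodes.
    datasets: Set[str] = field(default_factory=set)
    # Directed (citing paper, cited paper) edges.
    paper_cites_paper: Set[Tuple[str, str]] = field(default_factory=set)
    # Directed (citing paper, cited dataset) edges.
    paper_cites_dataset: Set[Tuple[str, str]] = field(default_factory=set)

    def add_paper(self, paper_id: str, text: str) -> None:
        self.paper_text[paper_id] = text

    def add_dataset(self, dataset_id: str) -> None:
        self.datasets.add(dataset_id)

    def cite_paper(self, src: str, dst: str) -> None:
        # Both endpoints must already exist as paper nodes.
        if src not in self.paper_text or dst not in self.paper_text:
            raise KeyError("both papers must be added before citing")
        self.paper_cites_paper.add((src, dst))

    def cite_dataset(self, paper_id: str, dataset_id: str) -> None:
        # The edge joins two different node types: paper -> dataset.
        if paper_id not in self.paper_text or dataset_id not in self.datasets:
            raise KeyError("paper and dataset must be added before citing")
        self.paper_cites_dataset.add((paper_id, dataset_id))

    def datasets_used_by(self, paper_id: str) -> List[str]:
        """Return the datasets a paper cites, in sorted order."""
        return sorted(d for p, d in self.paper_cites_dataset if p == paper_id)


# Hypothetical example data.
graph = HeteroGraph()
graph.add_paper("p1", "graph neural networks for recommendation")
graph.add_paper("p2", "topic models for scientific text")
graph.add_dataset("d1")
graph.cite_paper("p1", "p2")
graph.cite_dataset("p1", "d1")
print(graph.datasets_used_by("p1"))
```

Keeping the two edge types in separate sets reflects the heterogeneous structure: a recommender built on such a graph can treat paper-paper and paper-dataset citations differently, while the text attribute supplies the content signal.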
