Symbolic Graph Embedding Using Frequent Pattern Mining

Blaž Škrlj, Nada Lavrač, Jan Kralj
Relational data mining is becoming ubiquitous in many fields of study. It offers insights into the behaviour of complex, real-world systems which cannot be modeled directly using propositional learning. We propose Symbolic Graph Embedding (SGE), an algorithm for learning symbolic node representations. Building on ideas from the field of inductive logic programming, SGE first samples a given node's neighborhood and interprets it as a transaction database, which is used for frequent pattern mining…
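The neighborhood-as-transaction idea in the abstract can be illustrated with a minimal sketch. The abstract is truncated, so everything beyond "sample a neighborhood, treat it as a transaction, mine frequent patterns" is an assumption made for illustration: the breadth-first sampling, the brute-force itemset counting (standing in for a real frequent pattern miner such as Apriori or FP-growth), the binary feature encoding, and all function names are hypothetical and are not taken from the SGE paper.

```python
from collections import Counter, deque
from itertools import combinations


def sample_neighborhood(adj, start, max_hops=1):
    """Collect all nodes within max_hops of start (assumed sampling scheme: plain BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for neighbor in adj.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    seen.discard(start)  # the transaction holds the neighbors, not the node itself
    return frozenset(seen)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def symbolic_features(adj, min_support=2, max_hops=1):
    """Encode each node as a binary vector over the mined frequent patterns."""
    nodes = sorted(adj)
    transactions = {n: sample_neighborhood(adj, n, max_hops) for n in nodes}
    patterns = sorted(frequent_itemsets(transactions.values(), min_support))
    features = {
        n: [int(set(p) <= transactions[n]) for p in patterns] for n in nodes
    }
    return patterns, features


# Toy undirected graph given as adjacency lists (hypothetical data).
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
patterns, features = symbolic_features(adj, min_support=2, max_hops=1)
```

Each feature column in this sketch corresponds to a readable pattern (for example, "has node c in its neighborhood"), which is the sense in which such a representation is symbolic rather than a dense, uninterpretable vector.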
tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification
Transfer Learning for Node Regression Applied to Spreading Prediction
This paper is one of the first to explore the transferability of learned representations for the task of node regression. It shows that there exist pairs of networks with similar structure between which trained models can be transferred (zero-shot), and demonstrates their competitive performance.
SNoRe: Scalable Unsupervised Learning of Symbolic Node Representations
The proposed SNoRe (Symbolic Node Representations) algorithm is capable of learning symbolic, human-understandable representations of individual network nodes based on the similarity of neighborhood hashes to nodes chosen as features, and scales to large networks, making it suitable for many contemporary network analysis tasks.


RDF2Vec: RDF Graph Embeddings for Data Mining
RDF2Vec is an approach that takes language modeling techniques for unsupervised feature extraction from sequences of words and adapts them to RDF graphs. The paper shows that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be easily reused for different tasks.
Fast relational learning using bottom clause propositionalization with artificial neural networks
A fast method and system for relational learning based on a novel propositionalization called Bottom Clause Propositionalization (BCP) is introduced, which can achieve accuracy comparable to Aleph, and is extended to include a statistical feature selection method, mRMR, with preliminary results indicating that a reduction of more than 90% of features can be achieved with a small loss of accuracy.
LINE: Large-scale Information Network Embedding
LINE is a novel network embedding method suitable for arbitrary types of information networks (undirected, directed, and/or weighted). It optimizes a carefully designed objective function that preserves both the local and global network structures.
node2vec: Scalable Feature Learning for Networks
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
HINMINE: heterogeneous information network mining with information retrieval heuristics
The results show that HINMINE, using different network decomposition methods, can significantly improve the performance of the resulting classifiers, and also that using a modified label propagation algorithm is beneficial when the data set is imbalanced.
metapath2vec: Scalable Representation Learning for Heterogeneous Networks
Two scalable representation learning models, namely metapath2vec and metapath2vec++, are developed that are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, but also discern the structural and semantic correlations between diverse network objects.
DeepWalk: online learning of social representations
DeepWalk is an online learning algorithm which builds useful incremental results and is trivially parallelizable, which makes it suitable for a broad class of real-world applications such as network classification and anomaly detection.
Large-Scale Assessment of Deep Relational Machines
Testing on datasets from the biochemical domain involving hundreds of thousands of instances, with industrial-strength background predicates involving multiple hierarchies of complex definitions, and on both classification and regression tasks, provides substantially reliable evidence of the predictive capabilities of DRMs, along with a significant improvement in predictive performance when domain knowledge is incorporated.
An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data
A novel approach named AGM is proposed, which efficiently mines association rules among frequently appearing substructures in a given graph data set by extending the basket analysis algorithm.