Data Science with Vadalog: Bridging Machine Learning and Reasoning

Luigi Bellomarini, Ruslan R. Fayzrakhmanov, Georg Gottlob, Andrey Kravchenko, Eleonora Laurenza, Yavor Nenov, Stéphane Reissfelder, Emanuel Sallinger, Evgeny Sherkhonov, Lianlong Wu
Following the recent successful examples of large technology companies, many modern enterprises seek to build knowledge graphs to provide a unified view of corporate knowledge and to draw deep insights using machine learning and logical reasoning. [...] We argue that this is a significant step forward towards combining machine learning and reasoning in data science.
Vadalog: Recent Advances and Applications
An accessible, self-contained introduction to Warded Datalog+/-, the logical core of Vadalog, is given, and some recent practical applications of the Vadalog language are presented: the detection of close links in financial knowledge graphs and the detection of family-owned businesses.
Knowledge Graphs and Big Data Processing
This introductory chapter serves to characterize the relevant aspects of the Big Data Ecosystem with respect to big data characteristics, the components needed for implementing end-to-end big data processing and the need for using semantics for improving the data management, integration, processing, and analytical tasks.
Weaving Enterprise Knowledge Graphs: The Case of Company Ownership Graphs
An in-depth case analysis of company ownership graphs (graphs that have company ownership as a central concept) is presented, along with Vada-Link, a framework based on state-of-the-art approaches for knowledge representation and reasoning.
Harmless but Useful: Beyond Separable Equality Constraints in Datalog+/-
This paper proposes a more general class of EGDs, called “harmless”, that subsumes separable EGDs and makes it possible to model and reason about a much broader class of problems, and contributes a sufficient syntactic condition characterizing harmless EGDs.


Swift Logic for Big Data and Knowledge Graphs - Overview of Requirements, Language, and System
The Vadalog system is introduced, which exploits the theoretical underpinnings of relevant Datalog languages, combines them with existing and novel techniques from database and AI practice, and puts these swift logics into action.
Swift Logic for Big Data and Knowledge Graphs
This work formulates various requirements for a fully-fledged KGMS and presents KRR formalisms and a system achieving these goals.
The Vadalog System: Swift Logic for Big Data and Enterprise Knowledge Graphs
A KGMS is viewed as a knowledge base management system (KBMS), which performs complex rule-based reasoning tasks over very large amounts of data and provides methods and tools for data analytics and machine learning.
Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS
This work presents Tuffy, a scalable Markov Logic Networks framework that achieves scalability via three novel contributions: a bottom-up approach to grounding, a novel hybrid architecture that allows AI-style local search to be performed efficiently using an RDBMS, and a theoretical insight that shows when one can improve the efficiency of stochastic local search.
The Vadalog System: Datalog-based Reasoning for Knowledge Graphs
The Vadalog system is presented, a Datalog-based system for performing complex logic reasoning tasks, such as those required in advanced knowledge graphs, and the first implementation of Warded Datalog+/- is illustrated, a high-performance Datalog+/- system utilizing an aggressive termination control strategy.
SlimShot: In-Database Probabilistic Inference for Knowledge Bases
SlimShot is described, a probabilistic inference engine for knowledge bases that uses a simple Monte Carlo-based inference with three key enhancements, among them combining sampling with safe query evaluation and estimating a conditional probability by jointly computing the numerator and denominator.
Entity Resolution with Markov Logic
A well-founded, integrated solution to the entity resolution problem based on Markov logic, which combines first-order logic and probabilistic graphical models by attaching weights to first-order formulas and viewing them as templates for features of Markov networks.
Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction
An algorithm for inferring most-probable variable assignments (MAP inference) that is much more scalable than general-purpose convex optimization methods, because it uses message passing to take advantage of sparse dependency structures.
Inference and learning in probabilistic logic programs using weighted Boolean formulas
The results show that the inference algorithms improve upon the state of the art in probabilistic logic programming, and that it is indeed possible to learn the parameters of a probabilistic logic program from interpretations.
Markov logic networks
Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach to combining first-order logic and probabilistic graphical models in a single representation.