Data-Driven Joint Debugging of the DBpedia Mappings and Ontology - Towards Addressing the Causes Instead of the Symptoms of Data Quality in DBpedia

@inproceedings{Paulheim2017DataDrivenJD,
  title={Data-Driven Joint Debugging of the DBpedia Mappings and Ontology - Towards Addressing the Causes Instead of the Symptoms of Data Quality in DBpedia},
  author={Heiko Paulheim},
  booktitle={ESWC},
  year={2017}
}
DBpedia is a large-scale, cross-domain knowledge graph extracted from Wikipedia. For the extraction, crowd-sourced mappings from Wikipedia infoboxes to the DBpedia ontology are utilized. In this process, different problems may arise: users may create wrong and/or inconsistent mappings, use the ontology in an unforeseen way, or change the ontology without considering all possible consequences. In this paper, we present a data-driven approach to discover problems in mappings as well as in the…
Citations

Predicting incorrect mappings: a data-driven approach applied to DBpedia
This work proposes a data-driven method to detect incorrect mappings automatically by analyzing information from both instance data and ontological axioms; the best model achieves 93% accuracy.
Sustainable Linked Data Generation: The Case of DBpedia
This paper proposes a semantic-driven approach that decouples, in a declarative manner, the execution of the extraction, transformation, and mapping rules of the DBpedia Extraction Framework, achieving an enhanced data generation process that improves its quality, coverage, and sustainability.
Resolving Range Violations in DBpedia
The proposed approach, based on graph analysis and keyword matching, outperforms various baseline methods including entity search and knowledge graph completion; it exploits information from the incorrect objects, because they contain useful clues for finding the correct objects.
Rule-driven inconsistency resolution for knowledge graph generation rules
Resglass is described, including a ranking that determines the order in which rules and ontology elements should be inspected, together with its implementation; the evaluation shows that the automatic ranking achieves an 80% overlap with experts' rankings, reducing the effort required to resolve inconsistencies.
Automatic refinement of large-scale cross-domain knowledge graphs
This thesis investigates the problem of automatic knowledge graph refinement and proposes methods that address it from two directions: automatic refinement of the TBox and of the ABox.
Repairing mappings across biomedical ontologies by probabilistic reasoning and belief revision
A novel approach for repairing biomedical ontology mappings using probabilistic reasoning and belief revision, combining a removal strategy with a revision strategy; it is suitable for applications such as ontology-supported medical information retrieval, semantic annotation and indexing of medical articles, and matchmaking and ranking of objects across multiple ontologies.
DBkWik: extracting and integrating knowledge from thousands of Wikis
This paper shows how to create one consolidated knowledge graph, called DBkWik, from thousands of Wikis, and shows that the resulting large-scale knowledge graph is complementary to DBpedia.
R2RML and RML Comparison for RDF Generation, their Rules Validation and Inconsistency Resolution
In this paper, an overview of the state of the art on knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML.
Inferring Resource Types in Knowledge Graphs using NLP analysis and human in-the-loop validation: The DBpedia Case
Defining proper semantic types for resources in Knowledge Graphs is one of the key steps in building high-quality data. Often, this information is either missing or incorrect. Thus it is crucial to…
DBkWik: A Consolidated Knowledge Graph from Thousands of Wikis
This paper shows how to create one consolidated knowledge graph, called DBkWik, from thousands of Wikis, and shows that the resulting large-scale knowledge graph is complementary to DBpedia.

References

Showing 1-10 of 28 references
DBpedia Mappings Quality Assessment
This work proposes to validate the mappings that generate the data, instead of validating the generated data afterwards, and demonstrates how mapping validation is applied to DBpedia.
DBpedia ontology enrichment for inconsistency detection
To enable the detection of inconsistencies, this work focuses on enriching the DBpedia ontology by statistical methods, so that inconsistencies are detected during the extraction of Wikipedia data.
Serving DBpedia with DOLCE - More than Just Adding a Cherry on Top
By aligning the DBpedia ontology to the foundational ontology DOLCE-Zero, and by combining reasoning with clustering of the reasoning results, errors affecting millions of statements can be identified at minimal workload for the knowledge base designer.
Checking and Handling Inconsistency of DBpedia
This paper checks for inconsistency in DBpedia by rule-based distributed reasoning using MapReduce and shows that there are a number of inconsistencies, which should be handled with different methods to improve the data quality of DBpedia.
DBpedia Live Extraction
DBpedia is extended with a live extraction framework capable of processing tens of thousands of changes per day in order to consume the constant stream of Wikipedia updates; it also allows direct modifications of the knowledge base and closer interaction of users with DBpedia.
RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data
The RML mapping language is introduced: a generic language based on an extension of R2RML, the W3C standard for mapping relational databases to RDF. RML is source-agnostic and extensible, facilitating the definition of mappings over multiple heterogeneous sources.
WhoKnows? Evaluating linked data heuristics with a quiz that cleans up DBpedia
Purpose - Linking Open Data (LOD) provides a vast amount of well-structured semantic information, but many inconsistencies may occur, especially if the data are generated with the help of automated…
ORE - A Tool for Repairing and Enriching Knowledge Bases
ORE supports the detection of a variety of ontology modelling problems, guides the user through the process of resolving them, and allows extending an ontology through (semi-)automatic supervised learning.
Knowledge graph refinement: A survey of approaches and evaluation methods
A survey of knowledge graph refinement approaches, with a dual look at both the methods proposed and the evaluation methodologies used.
Fast Approximate A-Box Consistency Checking Using Machine Learning
It is shown that a central reasoning task, A-box consistency checking, can be approximated by training a machine learning model that mimics the behavior of a reasoner for a specific ontology.