CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System

Bo Xu, Yong Xu, Jiaqing Liang, Chenhao Xie, Bin Liang, Wanyun Cui, Yanghua Xiao
Great efforts have been dedicated to harvesting knowledge bases from online encyclopedias. [...] To solve these challenges, we propose a never-ending Chinese knowledge extraction system, CN-DBpedia, which automatically generates a knowledge base that is ever-increasing in size and constantly updated. Specifically, we reduce the human costs by reusing the ontology of existing knowledge bases and building an end-to-end fact extraction model.
CN-DBpedia2: An Extraction and Verification Framework for Enriching Chinese Encyclopedia Knowledge Base
This paper proposes an extraction and verification framework to enrich knowledge bases and builds a new version of the knowledge base, CN-DBpedia2, which additionally contains the high-confidence facts extracted from the description texts of entities.
Knowledge graph construction from multiple online encyclopedias
Experimental results show that the approaches for knowledge extraction and linking outperform state-of-the-art baselines in different evaluation metrics, and the framework can generate a large-scale knowledge graph after inputting multiple online encyclopedias.
IFTA: Iterative filtering by using TF-AICL algorithm for Chinese encyclopedia knowledge refinement
The precision, recall and F-measure results on the BaiduBaike and Hudong datasets indicate that the refining effects on open-domain Chinese encyclopedia KBs by the IFTA method outperform the state-of-the-art methods.
Research on Automatic Question Answering of Generative Knowledge Graph Based on Pointer Network
A new generative question answering method based on knowledge graphs is proposed, comprising three parts: knowledge vocabulary construction, data pre-processing, and answer generation. It achieves better performance on WebQA datasets than other methods.
Towards the Completion of a Domain-Specific Knowledge Base with Emerging Query Terms
This paper uses the product knowledge base in the largest Chinese e-commerce platform, Taobao, as an example to investigate a completion procedure of a domain-specific knowledge base, and proposes a graph based solution to overcome many challenges.
Chapter 10 Introduction to Chinese Knowledge Graphs and their Applications
  • Tianxing Wu, G. Qi, Cheng Li
  • Computer Science, Education
    The New Silk Road Leads through the Arab Peninsula: Mastering Global Business and Innovation
  • 2019
This chapter introduces the development of Chinese knowledge graphs and their applications, describes the background of OBOR, and presents the concept of the knowledge graph along with three typical Chinese knowledge graphs, including CN-DBpedia and XLORE.
Enabling Language Representation with Knowledge Graph and Structured Semantic Information
Sem-K-BERT is proposed, which integrates KG information and semantic role labeling before and after the BERT encoding layer. It introduces a context-aware knowledge screening mechanism based on semantic correlation calculation and a text-semantic alignment mechanism to effectively integrate the two kinds of external information and reduce the impact of noise.
A Survey of Techniques for Constructing Chinese Knowledge Graphs and Their Applications
This paper aims to introduce the techniques of constructing Chinese knowledge graphs and their applications, as well as analyse the impact of knowledge graph on OBOR.
CLEEK: A Chinese Long-text Corpus for Entity Linking
This work builds CLEEK, a Chinese corpus of multi-domain long text for entity linking, in order to encourage advancement of entity linking in languages besides English, and devises a measure to evaluate the difficulty of documents with respect to entity linking.
Decoding Chinese User Generated Categories for Fine-Grained Knowledge Harvesting
This paper introduces two word embedding projection models to identify is-a relations and proposes a graph clique mining algorithm to harvest non-taxonomic relations from UGCs, together with their textual patterns.


Yago: a core of semantic knowledge
YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts, including the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE).
DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia
An overview of the DBpedia community project is given, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications. DBpedia is one of the central interlinking hubs in the Linked Open Data (LOD) cloud.
Cross-Lingual Type Inference
This paper proposes a multi-label hierarchical classification algorithm to type Chinese entities with DBpedia types and exploits the cross-lingual entity linking between Chinese and English entities to construct the training data.
KBQA: An Online Template Based Question Answering System over Freebase
KBQA (Question Answering over Knowledge Bases) uses a new kind of question representation: templates, learned from a million-scale QA corpus, which effectively and efficiently support binary factoid questions or complex questions.
Weaving Chinese Linking Open Data
This paper presents the first effort to publish large-scale Chinese semantic data and link them together as a Chinese LOD (CLOD). It identifies important structural features in the three largest Chinese encyclopedia sites for extraction and proposes several data-level mapping strategies for automatic link discovery.
DBpedia: A Nucleus for a Web of Open Data
The extraction of the DBpedia datasets is described, along with how the resulting information is published on the Web for human and machine consumption and how DBpedia could serve as a nucleus for an emerging Web of open data.
Freebase: a collaboratively created graph database for structuring human knowledge
MQL provides an easy-to-use object-oriented interface to the tuple data in Freebase and is designed to facilitate the creation of collaborative, Web-based data-oriented applications.
A Graph-based Recommendation across Heterogeneous Domains
This paper proposes a graph-based approach for recommendation across heterogeneous domains that uses a bipartite graph to represent the relationships between entities and their features, together with an efficient propagation algorithm to obtain the similarity between entities from heterogeneous domains.
Crowdsourcing research opportunities: lessons from natural language processing
The positive impacts that crowdsourcing has had on Natural Language Processing research are highlighted and the challenges of more complex methodologies, quality control, and the necessity to deal with ethical issues are discussed.
Towards End-to-End Knowledge Graph Construction via a Hybrid LSTM-RNN Framework