Corpus ID: 210838584

Classifying Wikipedia in a fine-grained hierarchy: what graphs can contribute

Tiphaine Viard, Thomas McLachlan, Hamidreza Ghader, S. Sekine
Wikipedia is a huge opportunity for machine learning, being the largest semi-structured base of knowledge available. Because of this, many works examine its contents, and focus on structuring it in order to make it usable in learning tasks, for example by classifying it into an ontology. Beyond its textual contents, Wikipedia also displays a typical graph structure, where pages are linked together through citations. In this paper, we address the task of integrating graph (i.e. structure…


Derivation of "is a" taxonomy from Wikipedia Category Graph
This paper proposes an approach for deriving an "is a" taxonomy from the Wikipedia Category Graph (WCG), an open collaborative resource, and uses a set of well-known benchmarks to compare the results obtained with the generated taxonomy against those achieved with WordNet, a resource created and maintained by domain experts.
Fine-Grained Named Entity Classification with Wikipedia Article Vectors
This paper proposes to learn article vectors from the hypertext structure of Wikipedia using a Skip-gram model and to incorporate them into the input feature set, and shows that this yields statistically significant improvements in classification results.
Augmenting Wikipedia with Named Entity Tags
This paper investigates the task of labeling Wikipedia pages with standard named entity tags, which can then be used by a range of information extraction and language processing tools, and builds a Web service that classifies any Wikipedia page.
Multi-class Multilingual Classification of Wikipedia Articles Using Extended Named Entity Tag Set
This work introduces the Shinra 5-Language Categorization Dataset (SHINRA-5LDS), a large multi-lingual and multi-labeled set of Wikipedia articles in Japanese, English, French, German, and Farsi, annotated with the Extended Named Entity (ENE) tag set.
Modeling Relational Data with Graph Convolutional Networks
It is shown that factorization models for link prediction such as DistMult can be significantly improved through the use of an R-GCN encoder model to accumulate evidence over multiple inference steps in the graph, demonstrating a large improvement of 29.8% on FB15k-237 over a decoder-only baseline.
Learning multilingual named entity recognition from Wikipedia
The approach outperforms other approaches to automatic NE annotation, competes with gold-standard training when tested on an evaluation corpus from a different source, and performs 10% better than newswire-trained models on manually annotated Wikipedia text.
Inductive Representation Learning on Large Graphs
This paper presents GraphSAGE, a general inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data, and that outperforms strong baselines on three inductive node-classification benchmarks.
Extended Named Entity Ontology with Attribute Information
This paper reports on the design of a set of attributes for ENE categories, using a bottom-up approach to creating the knowledge from a Japanese encyclopedia, which contains abundant descriptions of ENE instances.
Us vs. Them: Understanding Social Dynamics in Wikipedia with Revert Graph Visualizations
This paper constructs Revert Graph, a tool that visualizes the overall conflict patterns between groups of users and enables visual analysis of opinion groups and rapid interactive exploration of those relationships via detail drill-downs.
DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia
An overview of the DBpedia community project is given, including its architecture, technical implementation, maintenance, internationalisation, usage statistics and applications; DBpedia is one of the central interlinking hubs in the Linked Open Data (LOD) cloud.