Ordia: A Web Application for Wikidata Lexemes

@inproceedings{Nielsen2019OrdiaAW,
  title={Ordia: A Web Application for Wikidata Lexemes},
  author={Finn {\AA}rup Nielsen},
  booktitle={ESWC},
  year={2019}
}
  • F. Nielsen
  • Published in ESWC, 2 June 2019
  • Computer Science
Since 2018, Wikidata has been able to describe lexemes, and the associated SPARQL endpoint, the Wikidata Query Service, can query this information and visualize the results. Ordia is a Web application that displays the multilingual lexeme data of Wikidata by embedding responses from the Wikidata Query Service obtained via templated SPARQL queries. Ordia also has a SPARQL-based approach for online matching of the words of a text with Wikidata lexemes, and the ability to use a knowledge graph… 
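
As a rough illustration of the templated-query idea, the sketch below substitutes a language QID into a SPARQL template and sends it to the public Wikidata Query Service endpoint. It is not Ordia's actual code: the helper name lexemes_for_language and the exact template are made up for this example, although the endpoint URL, the dct:language and wikibase:lemma predicates, and Q9035 (the Wikidata item for Danish) are real Wikidata conventions, and the standard prefixes are pre-declared by the query service.

import requests

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Query template: %s is replaced by the QID of a language item,
# e.g. Q9035 for Danish. The dct:, wd: and wikibase: prefixes are
# pre-declared at the Wikidata Query Service.
LEXEME_QUERY_TEMPLATE = """
SELECT ?lexeme ?lemma WHERE {
  ?lexeme dct:language wd:%s ;
          wikibase:lemma ?lemma .
}
LIMIT 10
"""

def lexemes_for_language(language_qid):
    """Hypothetical helper (not part of Ordia): run the templated
    query and return (lexeme URI, lemma) pairs."""
    response = requests.get(
        WDQS_ENDPOINT,
        params={"query": LEXEME_QUERY_TEMPLATE % language_qid,
                "format": "json"},
        headers={"User-Agent": "ordia-sketch/0.1 (example)"},
    )
    response.raise_for_status()
    return [(row["lexeme"]["value"], row["lemma"]["value"])
            for row in response.json()["results"]["bindings"]]

for uri, lemma in lexemes_for_language("Q9035"):
    print(uri, lemma)

Ordia renders such results as web pages rather than printing them; the point of the sketch is only that a single query template can serve any language through parameter substitution.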
Citations

Validating Danish Wikidata lexemes
TLDR
This work demonstrates the first application of ShEx to validating Wikidata lexeme entity data against ShEx schemas, presents a use case and benchmark for ShEx, and discusses its current limitations.
Lexemes in Wikidata: 2020 status
Wikidata now records data about lexemes, senses and lexical forms and exposes them as Linguistic Linked Open Data. Since lexeme support in Wikidata was first established in 2018, this data has grown
Danish in Wikidata lexemes
TLDR
The lexicographic part of Wikidata is described, along with experiences setting up lexemes for the Danish language, and various possible annotations for lexemes are noted.
Architecture for a multilingual Wikipedia
TLDR
This paper proposes an architecture for a system that fulfills the goal in two parts: creating and maintaining content in an abstract notation within a project called Abstract Wikipedia, and creating an infrastructure called Wikilambda that can translate this notation to natural language.
