Wikidata as a knowledge graph for the life sciences

@article{Waagmeester2020WikidataAA,
  title={Wikidata as a knowledge graph for the life sciences},
  author={A. Waagmeester and Gregory S. Stupp and S. Burgstaller-Muehlbacher and Benjamin M. Good and M. Griffith and O. Griffith and K. Hanspers and H. Hermjakob and Toby S. Hudson and K. Hybiske and S. Keating and M. Manske and Michael Mayers and D. Mietchen and Elvira Mitraka and Alexander R. Pico and T. Putman and Anders Riutta and N. Queralt-Rosinach and L. Schriml and Thomas M A Shafee and D. Slenter and R. Stephan and Katherine Thornton and Ginger Tsueng and Roger Tu and Sabah Ul-Hasan and Egon Willighagen and Chunlei Wu and A. Su},
  journal={eLife},
  year={2020},
  volume={9}
}
Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with…
A protocol for adding knowledge to Wikidata, a case report
TLDR
This paper shows how a data schema required for the integration of knowledge can be modelled with entity schemas represented by Shape Expressions, and how this model can be used to make data between various resources interoperable.
A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses
TLDR
This paper presents a model that can be used to make data between various resources interoperable, demonstrates it by integrating data from NCBI Taxonomy, NCBI Genes, UniProt, and WikiPathways, and describes the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses.
Representing COVID-19 information in collaborative knowledge graphs: the case of Wikidata
Information related to the COVID-19 pandemic ranges from biological to bibliographic, from geographical to genetic and beyond. The structure of the raw data is highly complex, so converting it to…
Representing COVID-19 information in collaborative knowledge graphs: a study of Wikidata
TLDR
Four aspects of Wikidata are introduced that make it an ideal knowledge base for information on the COVID-19 pandemic: its flexible data model, its multilingual features, its alignment to multiple external databases, and its multidisciplinary organization.
Using logical constraints to validate information in collaborative knowledge graphs: a study of COVID-19 on Wikidata
TLDR
This research paper catalogs an automatable task set necessary to assess and validate the portion of Wikidata relating to the COVID-19 disease, its causative virus, and key aspects of the resulting pandemic.
WikiPathways: connecting communities
TLDR
The growth of WikiPathways over the last three years is shown, the new communities and collaborations of pathway authors and curators are highlighted, and various technologies to connect to external resources and initiatives are described.
Strategies for Assembling the Biodiversity Knowledge Graph
TLDR
This talk explores different strategies for assembling the “biodiversity knowledge graph”, including a centralised, crowd-sourced approach using Wikidata as the foundation and ways that knowledge graphs could lead directly to visualising the links between taxonomy and the taxonomic literature.
The LOTUS Initiative for Open Natural Products Research: Knowledge Management through Wikidata
TLDR
The newly established LOTUS initiative has now completed the first steps toward the harmonization, curation, validation and open dissemination of 700,000+ referenced structure-organism pairs, and embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights.
CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations
TLDR
CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database that is enriched with the deep-learning-based prediction of relationships between numerous data entries, followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules.
Automatic synchronization of RDF graphs representing ontologies and Wikibase instances
TLDR
This paper proposes a system that automatically synchronizes RDF files hosted in a version control system with a Wikibase instance, describes the system from an architectural point of view, and explains the main components needed for the synchronization of data.

References

SHOWING 1-10 OF 109 REFERENCES
Wikidata: A large-scale collaborative ontological medical database
TLDR
The data model and characteristics of Wikidata are shown and it is explained how this database can be automatically processed by users as well as by computer methods and programs.
WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata
With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional…
WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research
TLDR
The number of annotated metabolite nodes in WikiPathways has doubled, and OpenAPI documentation of the authors' web services and FAIR annotation of resources increase the interoperability of the knowledge encoded in these pathways and experimental omics data.
Wikidata as a semantic framework for the Gene Wiki initiative
TLDR
A fully open and extensible data resource for human and mouse molecular biology and biochemistry data that enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web.
Wikidata: a new platform for collaborative data collection
TLDR
This year, Wikimedia starts to build a new platform for the collaborative acquisition and maintenance of structured data: Wikidata, which will be a secondary database, i.e. instead of containing facts it will contain references for facts.
The Reactome pathway knowledgebase
TLDR
The Reactome Web site and analysis tool set have been completely redesigned to increase speed, flexibility and user friendliness, and the data model has been extended to support annotation of disease processes due to infectious agents and to mutation.
Harmonising phenomics information for a better interoperability in the rare disease field.
TLDR
The HIPBI-RD ecosystem will contribute to the interpretation of variants identified through exome and full genome sequencing by harmonising the way phenotypic information is collected, thus improving diagnostics and delineation of RD.
UniProt: a worldwide hub of protein knowledge
TLDR
The UniProt Knowledgebase is a collection of sequences and annotations for over 120 million proteins across all branches of life. It has greatly expanded the number of Reference Proteomes that it provides, and in particular it has focussed on improving the number of viral Reference Proteomes.
Human Disease Ontology 2018 update: classification, content and workflow expansion
TLDR
The DO’s continual integration of human disease knowledge, evidenced by the more than 200 SVN/GitHub releases/revisions, includes the addition of 2650 new disease terms, a 30% increase of textual definitions, and an expanding suite of disease classification hierarchies constructed through defined logical axioms.
Funding knowledgebases: Towards a sustainable funding model
Millions of life scientists across the world rely on bioinformatics data resources for their research projects. Data resources can be very expensive, especially those with a high added value as the…