CADA: Phenotype-driven gene prioritization based on a case-enriched knowledge graph

Chengyao Peng, Simon Dieck, Alexander Schmid, Ashar Ahmad, Alexej Knaus, Maren Wenzel, Laura Mehnert, Birgit Zirn, Tobias B Haack, Stephan Ossowski, Matias Wagner, Theresa Brunet, Nadja Ehmke, Magdalena Danyel, Stanislav Rosnev, Tom Kamphans, Guy Nadav, Nicole Fleischer, Holger Fröhlich, Peter M. Krawitz
Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which is also increasingly used in electronic health records (EHRs). Many algorithms that perform HPO-based gene prioritization have been developed; however, the performance of many such tools suffers from an overrepresentation of atypical cases in the medical literature…
The GA4GH Phenopacket schema: A computable representation of clinical data for precision medicine
The GA4GH Phenopacket schema is a freely available, community-driven standard that streamlines exchange and systematic use of phenotypic data and will facilitate sophisticated computational analysis of both clinical and genomic information to help improve the understanding of diseases and the ability to manage them.
Graph Based Link Prediction between Human Phenotypes and Genes
This study developed a framework to predict links between Human Phenotype Ontology (HPO) terms and genes using five different supervised machine learning algorithms, and shows that the gradient boosting decision tree model LightGBM finds more accurate links between human phenotype and gene pairs.


Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources
The HPO’s interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism data. The HPO also plays a key role in the popular Exomiser tool, which identifies potential disease-causing variants from whole-exome or whole-genome sequencing data.
Clinical diagnostics in human genetics with semantic similarity searches in ontologies.
The differential diagnostic process attempts to identify candidate diseases that best explain a set of clinical features. This process can be complicated by the fact that the features can have
Interpretable Clinical Genomics with a Likelihood Ratio Paradigm
An approach to genomic diagnostics is presented that exploits the likelihood ratio (LR) framework to provide estimates of the posttest probability of candidate diagnoses, the LR for each observed HPO phenotype, and the predicted pathogenicity of observed genotypes.
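As a rough illustration of the likelihood ratio idea, the sketch below converts a pretest probability and a set of per-feature LRs into a posttest probability using the standard odds form of Bayes' rule. It assumes the individual LRs can simply be multiplied (conditional independence); the function name and the example numbers are hypothetical and are not taken from the cited paper.

```python
from math import prod


def posttest_probability(pretest_probability, likelihood_ratios):
    """Combine a pretest probability with likelihood ratios (odds form of Bayes' rule).

    Assumption: the likelihood ratios are conditionally independent,
    so their product is the combined likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must lie strictly between 0 and 1")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical example: a 10% pretest probability and one feature with LR = 9
# gives posttest odds of about 1, i.e. a posttest probability of about 0.5.
example = posttest_probability(0.1, [9.0])
```

An LR above 1 raises the probability of the candidate diagnosis, an LR below 1 lowers it, and an LR of exactly 1 leaves it unchanged.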
Phen2Gene: Rapid Phenotype-Driven Gene Prioritization for Rare Diseases
Phen2Gene outperforms existing gene prioritization tools in speed, and acts as a real-time phenotype-driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases.
HPO2Vec+: Leveraging heterogeneous knowledge resources to enrich node embeddings for the Human Phenotype Ontology
The qualitative evaluation shows that the enriched HPO embeddings are generally able to detect relationships among nodes with fine granularity and HPOEmb-Orphanet is particularly good at associating phenotypes across different disease systems.
The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data
The updated HPO database is described, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO, allowing integration of existing datasets and interoperability with multiple biomedical resources.
ClinVar: improving access to variant interpretations and supporting evidence
ClinVar continues to make improvements to its search and retrieval functions.
Phenolyzer: phenotype-based prioritization of candidate genes for human diseases
Phenolyzer is a tool that uses prior information to implicate genes involved in diseases, and exhibits superior performance over competing methods for prioritizing Mendelian and complex disease genes, based on disease or phenotype terms entered as free text.
node2vec: Scalable Feature Learning for Networks
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
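The biased random walk can be sketched in a few lines. This is a minimal, unweighted version that assumes the graph is a dictionary mapping each node to a set of neighbors; `biased_walk`, `TOY_GRAPH`, and the node labels are hypothetical names for illustration. The return parameter `p` and in-out parameter `q` follow the usual node2vec convention: a step back to the previous node has weight 1/p, a step to a neighbor of the previous node has weight 1, and a step further away has weight 1/q.

```python
import random


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk of at most `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move outward
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph (undirected, stored as adjacency sets).
TOY_GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

walk = biased_walk(TOY_GRAPH, "a", 10, p=0.5, q=2.0, rng=random.Random(0))
```

With a small `p` the walk tends to backtrack and stay local, while a small `q` pushes it outward. In the full method, walks like these are fed to a skip-gram model to learn the node embeddings; that step is omitted here.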
Optuna: A Next-generation Hyperparameter Optimization Framework
New design criteria for next-generation hyperparameter optimization software are introduced, including a define-by-run API that allows users to construct the parameter search space dynamically, and an easy-to-set-up, versatile architecture that can be deployed for various purposes.