Unifying Heterogenous Electronic Health Records Systems via Text-Based Code Embedding

Kyunghoon Hur, Jiyoung Lee, Jungwoo Oh, Wesley Price, Young-Hak Kim, E. Choi

BACKGROUND: A substantial increase in the use of Electronic Health Records (EHRs) has opened new frontiers for predictive healthcare. However, while EHR systems are nearly ubiquitous, they lack a unified code system for representing medical concepts. The heterogeneous formats of EHRs present a substantial barrier to the training and deployment of state-of-the-art deep learning models at scale. OBJECTIVE: The aim of this study is to suggest a novel text embedding approach to overcome…

UniHPF : Universal Healthcare Predictive Framework with Zero Domain Knowledge

Experimental results demonstrate that UniHPF is capable of building large-scale EHR models that can process any form of medical data from distinct EHR systems, and outperforms baseline models in multi-source learning tasks, including transfer and pooled learning, while also showing comparable results when trained on a single medical dataset.



MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare

Multilevel Medical Embedding (MiME) is proposed, which learns a multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on the inherent structure of EHR data, without the need for external labels.

Deep representation learning of electronic health records to unlock patient stratification at scale

It is demonstrated that ConvAE can generate patient representations that lead to clinically meaningful insights and can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.

Scalable and accurate deep learning with electronic health records

A representation of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format is proposed, and it is demonstrated that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization.

Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction

Inspired by BERT, Med-BERT is a contextualized embedding model pretrained on a structured EHR dataset of 28,490,650 patients that substantially improves prediction accuracy and can boost the area under the receiver operating characteristic curve (AUC) by 1.21–6.14% in two disease prediction tasks from two clinical databases.

Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer

The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data.

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records

The findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.

GRAM: Graph-based Attention Model for Healthcare Representation Learning

Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and 3% improved area under the ROC curve for predicting heart failure using an order of magnitude less training data.

Pre-training of Graph Augmented Transformers for Medication Recommendation

G-BERT is the first to bring the language model pre-training schema into the healthcare domain and it achieved state-of-the-art performance on the medication recommendation task.

Multi-layer Representation Learning for Medical Concepts

This work proposes Med2Vec, which not only learns representations for both medical codes and visits from large EHR datasets with over a million visits, but also allows the learned representations to be interpreted, an interpretation confirmed positively by clinical experts.

Domain Knowledge Guided Deep Learning with Electronic Health Records

Experimental results on heart failure risk prediction tasks show that the proposed model not only outperforms state-of-the-art deep-learning-based risk prediction models, but also associates individual medical events with heart failure onset, thus paving the way for interpretable, accurate clinical risk predictions.