Corpus ID: 203837658

Representation Learning of EHR Data via Graph-Based Medical Entity Embedding

Tong Wu, Yunlong Wang, Yue Wang, Emily Zhao, Yilian Yuan, Zhi Yang
Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare informatics that turns heterogeneous medical records into structured and actionable information. Here we propose ME2Vec, an algorithmic framework for learning low-dimensional vectors of the most common entities in EHR: medical services, doctors, and patients. ME2Vec leverages diverse graph embedding techniques to cater for the unique characteristic of each medical entity… 

DUGRA: Dual-Graph Representation Learning for Health Information Networks
Experimental results show that the diagnosis embeddings learned from the model, DUal-GRAph Representation Learning (DUGRA), outperform the current state-of-the-art models in terms of diagnosis prediction accuracy.
Using Character-Level and Entity-Level Representations to Enhance Bidirectional Encoder Representation From Transformers-Based Clinical Semantic Textual Similarity Model: ClinicalSTS Modeling Study
Experimental results show that both character-level information and entity-level information can effectively enhance the BERT-based STS model.
Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities
This review synthesizes a spectrum of algorithmic approaches that, at their core, leverage topological features to embed networks into compact vector spaces and provides a taxonomy of biomedical areas that are likely to benefit most from algorithmic innovation.
A Survey on Knowledge Enhanced EHR Data Mining
This survey summarizes the types of knowledge found in EHR data, presents suitable methods for representing that knowledge, and reviews its specific applications in each field, with the aim of promoting the development of knowledge enhancement technology for EHR.
DBNet: a novel deep learning framework for mechanical ventilation prediction using electronic health records
A prediction model that estimates the probability that in-hospital patients will require mechanical ventilation at least 24 hours after their admission, and that outperforms the state-of-the-art baseline deep learning models in predicting the future requirement of mechanical ventilation.
Representation Learning for Diagnostic Data
A representation learning framework for the medical diagnosis domain is proposed, based on a heterogeneous network model of diagnostic data combined with an algorithm for learning latent node representations; a modification of the metapath2vec algorithm is also proposed for representation learning on heterogeneous networks.
Computer Information Systems and Industrial Management: 19th International Conference, CISIM 2020, Bialystok, Poland, October 16–18, 2020, Proceedings
The focus will be on discussing some of the relevant techniques used for solving the nurse scheduling problem, including a novel solution specifically aimed to increase patient satisfaction.


Measuring Patient Similarities via a Deep Architecture with Medical Concept Embedding
A patient similarity evaluation framework based on temporal matching of longitudinal patient EHRs, which takes a convolutional neural network architecture, and learns an optimal representation of patient clinical record through medical concept embedding.
MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare
Multilevel Medical Embedding (MiME) is proposed which learns the multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on this inherent EHR structure without the need for external labels.
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data.
Graph Convolutional Transformer: Learning the Graphical Structure of Electronic Health Records
This paper argues that the Transformer is a suitable model to learn the hidden EHR structure, and proposes the Graph Convolutional Transformer, which uses data statistics to guide the structure learning process.
Medical Concept Embedding with Time-Aware Attention
This paper proposes to incorporate temporal information when embedding medical codes in EMRs using the Continuous Bag-of-Words model, which employs the attention mechanism to learn a "soft" time-aware context window for each medical concept.
Multi-task Sparse Metric Learning for Monitoring Patient Similarity Progression
The experimental results show that the proposed mtTSML, a multi-task triplet constrained sparse metric learning method, significantly outperforms the state-of-the-art baselines, including both single-task and multi-task metric learning methods.
Uncorrelated Patient Similarity Learning
This work proposes a novel uncorrelated patient similarity learning approach, which can not only select the most relevant features for the learning task, but also guarantee that the selected features have low correlations with each other.
Graph Attention Networks
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior
node2vec: Scalable Feature Learning for Networks
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
LINE: Large-scale Information Network Embedding
A novel network embedding method called "LINE," which is suitable for arbitrary types of information networks (undirected, directed, and/or weighted) and optimizes a carefully designed objective function that preserves both the local and global network structures.