Composite distance metric integration by leveraging multiple experts' inputs and its application in patient similarity assessment

Fei Wang, Jimeng Sun, Shahram Ebadollahi. Statistical Analysis and Data Mining: The ASA Data Science Journal.
In the real world, different experts often hold different opinions on the same problem because of their different experience. […] We formulate Comdi as a quadratic optimization problem and propose an efficient alternating strategy to find the solution. Besides learning a globally consistent metric, Comdi provides an elegant way to share knowledge across multiple experts without sharing the underlying data, which lowers the risk of disclosing private data. Our experiments on several…

A relative similarity based method for interactive patient risk prediction

An interactive patient risk prediction method is proposed that actively queries medical experts about the relative similarity of patients. The work makes several discoveries, including that querying relative similarities is effective in patient risk prediction and can sometimes yield better prediction accuracy than asking absolute questions.

Improving Clinical Subjects Clustering by Learning and Optimizing Feature Weights

It is shown, based on silhouette score and principal component analysis, that learning feature weights is necessary to generate a meaningful separation of data in high-dimensional space in an unsupervised manner.

Patient similarity: methods and applications

This review summarizes representative methods used in each step of a typical patient similarity study and discusses applications of patient similarity networks, especially in the context of precision medicine.

Efficient Concept-based Document Ranking

This paper formally defines these important problems and shows that they pose unique algorithmic challenges due to the nature of the search and similarity semantics and the multi-level relationships between the concepts, and presents an efficient algorithm to compute the similarity between two EMRs.

A Patient Outcome Prediction based on Random Forest

To predict a patient's death outcome (namely, death due to illness or still alive), this work makes full use of patients' visit records and proposes a prediction method that combines the medical concept representation model Med2Vec with the random forest algorithm.

Protein Fold Classification Using Large Margin Combination of Distance Metrics

By generalizing the concept of the large margin nearest neighbor (LMNN), a method for combining multiple distance metrics from different types of protein structure comparison methods is proposed for the protein fold classification task and demonstrated in classification experiments on two public protein datasets.

Pediatric readmission classification using stacked regularized logistic regression models.

A novel approach to improved classification with shared predictive models for environments where centralized collection of data is not possible, which allows a high level of performance along with comprehensibility of obtained results.

Exploiting Cognitive Computing and Frame Semantic Features for Biomedical Document Clustering

This work shows how it is possible to cluster medical reports, based on features detected by using two emerging tools, IBM Watson and Framester, from a collection of text documents.



Localized Supervised Metric Learning on Temporal Physiological Data

This paper presents a method that leverages localized supervised metric learning to effectively incorporate physicians’ expert knowledge to arrive at semantically sound patient similarity measures.

Distance Metric Learning with Application to Clustering with Side-Information

This paper presents an algorithm that, given examples of similar (and, if desired, dissimilar) pairs of points in ℝⁿ, learns a distance metric over ℝⁿ that respects these relationships.

Rank-based distance metric learning: An application to image retrieval

This work proposes rank-based distance metric learning for information retrieval, which compares distances only among the relevant and irrelevant objects for a given query, and applies the proposed framework to tattoo image retrieval in the forensics and law enforcement application domain.

Predicting Patient's Trajectory of Physiological Data using Temporal Trends in Similar Patients: A System for Near-Term Prognostics.

A novel system is presented, which leverages inter-patient similarity for retrieving patients who display similar trends in their physiological time-series data, and which is used to project patient data into the future to provide insights for the query patient.

Measuring classifier performance: a coherent alternative to the area under the ROC curve

D. Hand, Computer Science, Machine Learning, 2009
A simple, valid alternative to the AUC is proposed, and the AUC's fundamental incoherence in terms of misclassification costs is explored in detail.

Distance Metric Learning: A Comprehensive Survey

A number of techniques that are central to distance metric learning are discussed, including convex programming, positive semi-definite programming, kernel learning, dimension reduction, K Nearest Neighbor, large margin classification, and graph-based approaches.

Distance Metric Learning for Large Margin Nearest Neighbor Classification

This paper shows how to learn a Mahalanobis distance metric for kNN classification from labeled examples in a globally integrated manner and finds that metrics trained in this way lead to significant improvements in kNN classification.

Two Heads Better Than One: Metric+Active Learning and its Applications for IT Service Classification

Two seemingly independent methods for classification of problem/change tickets are developed: Discriminative Neighborhood Metric Learning (DNML) and Active Learning with Median Selection (ALMS), both of which are, however, based on the same core technique: iterated representative selection.

Kernel-based distance metric learning for microarray data classification

A novel distance metric, derived from the procedure of a data-dependent kernel optimization, can substantially increase the class separability of the data in the feature space and lead to a significant improvement in the performance of the KNN classifier.

Semisupervised Metric Learning by Maximizing Constraint Margin

This paper considers the problem of learning a proper distance metric under the guidance of weak supervisory information in the form of pairwise constraints, which specify whether a pair of data points is in the same class (must-link constraints) or in different classes (cannot-link constraints).