• Corpus ID: 220936378

Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome

Authors: Alessandra Cabassi, Denis Seyres, Mattia Frontini, Paul D. W. Kirk
Journal: arXiv: Methodology
Building classification models that predict a binary class label on the basis of high dimensional multi-omics datasets poses several challenges, due to the typically widely differing characteristics of the data layers in terms of number of predictors, type of data, and levels of noise. Previous research has shown that applying classical logistic regression with elastic-net penalty to these datasets can lead to poor results (Liu et al., 2018). We implement a two-step approach to multi-omic… 

Transcriptional, epigenetic and metabolic signatures in cardiometabolic syndrome defined by extreme phenotypes

It was shown that the morbidly obese and lipodystrophy groups, despite some differences, shared a common cardiometabolic syndrome signature, and this could be used to discriminate, amongst the normal population, those individuals with a higher likelihood of presenting with the disease, even when not displaying the classic features.

Kernel learning approaches for summarising and combining posterior similarity matrices

The observation that posterior similarity matrices (PSMs) are positive semi-definite can be used to define probabilistically motivated kernel matrices that capture the clustering structure present in the data. This makes it possible to employ a range of kernel methods to obtain summary clusterings, and otherwise exploit the information summarised by PSMs.



Balancing the Robustness and Predictive Performance of Biomarkers

A number of strategies that combine assessments of stability and predictive performance in order to identify biomarkers that are both robust and diagnostically useful are presented and applied to identify a number of robust candidate biomarkers for the human disease HTLV1-associated myelopathy/tropical spastic paraparesis (HAM/TSP).

Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and Elaboration

In virtually all medical domains, diagnostic and prognostic multivariable prediction models are being developed, validated, updated, and implemented with the aim to assist doctors and individuals in estimating probabilities and potentially influence their decision making.

Integration of Ranked Lists via Cross Entropy Monte Carlo with Applications to mRNA and microRNA Studies

Formulating the problem of integrating ranked lists as minimizing an objective criterion, this work explores the usage of a cross entropy Monte Carlo method for solving such a combinatorial problem.

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection

The results indicate that for real-world datasets similar to the authors', the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
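
As an illustration of the stratified ten-fold scheme mentioned in this summary, the following is a minimal sketch in plain Python. The function names `stratified_k_fold` and `train_test_splits` are hypothetical and chosen here for illustration; they are not taken from the cited study. The sketch assumes class labels that can be sorted (for example integers or strings) and deals the indices of each class round-robin across the folds so that every fold roughly preserves the overall class proportions.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Return k lists of test indices with class proportions roughly preserved.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class
        # stopped so that fold sizes stay balanced.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_k_fold(labels, k, seed)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, sorted(test)
```

Each sample appears in exactly one test fold, so averaging a model's accuracy over the ten splits uses every observation once for testing.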

Robust rank aggregation for gene list integration and meta-analysis

This work proposes a novel robust rank aggregation (RRA) method that detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs and assigns a significance score to each gene.

A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications

A systematic framework is proposed to define the different situations that may occur based on the nature of the individually ranked lists, and general guidelines are provided about which methods perform best or worst, and under what conditions.

Efficient parameter selection for support vector machines in classification and regression via model-based global optimization

  • H. Frohlich, A. Zell
  • Computer Science
    Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.
  • 2005
This paper proposes an algorithm to deal with the model selection problem, based on the idea of learning an online Gaussian process model of the error surface in parameter space and sampling systematically at the points for which the so-called expected improvement is highest.
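
For reference, the expected improvement criterion named in this summary is commonly written as below for a minimisation problem. The notation is an assumption made here and is not taken from the cited paper: $\mu(\theta)$ and $\sigma(\theta)$ denote the Gaussian process posterior mean and standard deviation of the error at parameter setting $\theta$, $e_{\min}$ is the lowest error observed so far, and $\Phi$ and $\phi$ are the standard normal distribution and density functions.

```latex
\mathrm{EI}(\theta)
  = \bigl(e_{\min} - \mu(\theta)\bigr)\,\Phi\bigl(z(\theta)\bigr)
  + \sigma(\theta)\,\phi\bigl(z(\theta)\bigr),
\qquad
z(\theta) = \frac{e_{\min} - \mu(\theta)}{\sigma(\theta)},
\quad \sigma(\theta) > 0 .
```

The first term rewards points whose predicted error is below the current best, and the second rewards points where the model is uncertain, so maximising the criterion trades off exploitation against exploration.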

Diagnosis and Management of the Metabolic Syndrome: An American Heart Association/National Heart, Lung, and Blood Institute Scientific Statement

This statement from the American Heart Association and the National Heart, Lung, and Blood Institute is intended to provide up-to-date guidance for professionals on the diagnosis and management of the metabolic syndrome in adults.

Gene prioritization through genomic data fusion

A bioinformatics approach, together with a freely accessible, interactive and flexible software termed Endeavour, to prioritize candidate genes underlying biological processes or diseases, based on their similarity to known genes involved in these phenomena, offers an alternative integrative method for gene discovery.

Plasma proteome analysis in HTLV-1-associated myelopathy/tropical spastic paraparesis

The results indicate that monocytes are activated by contact with activated endothelium in HAM, and a diagnostic algorithm is derived that correctly classified the disease status (presence or absence of HAM) in 81% of HTLV-1-infected subjects in the cohort.