A whitening approach to probabilistic canonical correlation analysis for omics data integration

@article{Jendoubi2018AWA,
  title={A whitening approach to probabilistic canonical correlation analysis for omics data integration},
  author={Takoua Jendoubi and Korbinian Strimmer},
  journal={BMC Bioinformatics},
  year={2018},
  volume={20}
}
Background: Canonical correlation analysis (CCA) is a classic statistical tool for investigating complex multivariate data. Correspondingly, it has found many diverse applications, ranging from molecular biology and medicine to social science and finance. Intriguingly, despite the importance and pervasiveness of CCA, a probabilistic understanding of CCA has only recently begun to develop, moving from an algorithmic to a model-based perspective and enabling its application to large-scale settings…
The multirank likelihood for semiparametric canonical correlation analysis
Many analyses of multivariate data are focused on evaluating the dependence between two sets of variables, rather than the dependence among individual variables within each set. Canonical correlation
The multirank likelihood and cyclically monotone Monte Carlo: a semiparametric approach to CCA
TLDR
A semiparametric approach to CCA is presented in which the multivariate margins of each variable set may be arbitrary, but the dependence between variable sets is described by a parametric model that provides a low-dimensional summary of dependence.
SDGCCA: Supervised Deep Generalized Canonical Correlation Analysis for Multi-omics Integration
TLDR
A novel method of multi-omics integration called supervised deep generalized canonical correlation analysis (SDGCCA) is proposed, aiming for improving classification of phenotypes and revealing biomarkers related to phenotypes, which outperformed other CCA-based methods and other supervised methods.
Tensor Canonical Correlation Analysis With Convergence and Statistical Guarantees
TLDR
It is shown that the carefully initialized power method converges to the optimum, and a finite-sample bound is provided; the method can be used effectively in large-scale data settings by solving the inner least-squares problem with stochastic gradient descent.
Orthonormal Canonical Correlation Analysis
TLDR
The orthonormal approximation of data matrices, which corresponds to using singular value decomposition in the canonical correlations, is presented; it can be helpful in managerial estimations and decision making.
Tensor Canonical Correlation Analysis
TLDR
This paper studies canonical correlation analysis by extending the framework of two-dimensional analysis to tensor-valued data and proposes an efficient algorithm, called the higher-order power method, which is commonly used in tensor decomposition and is more efficient in large-scale settings.
Approaches to Integrating Metabolomics and Multi-Omics Data: A Primer
TLDR
The purpose of this review is to look at the various aspects that guide the choice of a statistical integrative analysis pipeline, grouping the different statistical multi-omics data integration approaches into state-of-the-art classes under which all existing statistical methods fall.
Various dimension reduction techniques for high dimensional data analysis: a review
TLDR
A detailed investigation of various feature extraction and feature selection methods has been carried out with a systematic comparison of several dimension reduction techniques for the analysis of high dimensional data and to overcome the problem of data loss.
The Effect of Neuroepo on Cognition in Parkinson’s Disease Patients Is Mediated by Electroencephalogram Source Activity
We report on the quantitative electroencephalogram (qEEG) and cognitive effects of Neuroepo in Parkinson’s disease (PD) from a double-blind safety trial (https://clinicaltrials.gov/, number
Specific immune-regulatory transcriptional signatures reveal sex and age differences in SARS-CoV-2 infected patients
TLDR
It is found that female and young patients infected by SARS-CoV-2 exhibited a similar transcriptomic pattern with a larger number of total (up- and downregulated) differentially expressed genes (DEGs) compared to males and elderly patients.

References

Sparse canonical correlation analysis applied to ‐omics studies for integrative analysis and biomarker discovery
TLDR
The results from two studies show that SCCA could effectively find the correlated patterns between two data sets, which are of high importance for understanding the relationship between two underlying chemical or biological processes.
Sparse canonical correlation analysis from a predictive point of view
  • I. Wilms, C. Croux
  • Computer Science
    Biometrical journal. Biometrische Zeitschrift
  • 2015
TLDR
This paper considers the CCA problem from a predictive point of view, recasts it into a regression framework, and induces sparsity in the canonical vectors using an alternating regression approach together with a lasso penalty.
Fast regularized canonical correlation analysis
A novel algorithm for simultaneous SNP selection in high-dimensional genome-wide association studies
TLDR
A novel multivariate algorithm for large scale SNP selection using CAR score regression, a promising new approach for prioritizing biomarkers, that consistently outperforms all competing approaches, both uni- and multivariate, in terms of correctly recovered causal SNPs and SNP ranking.
A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics
TLDR
This work proposes a novel shrinkage covariance estimator that exploits the Ledoit-Wolf (2003) lemma for analytic calculation of the optimal shrinkage intensity and applies it to the problem of inferring large-scale gene association networks.
Sparse Canonical Correlation Analysis with Application to Genomic Data Integration
TLDR
This paper presents Sparse Canonical Correlation Analysis (SCCA), which examines the relationships between two types of variables and provides sparse solutions that include only small subsets of variables of each type by maximizing the correlation between the subsets of variables of different types while performing variable selection.
High-Dimensional Regression and Variable Selection Using CAR Scores
TLDR
The CAR score is introduced, a novel and highly effective criterion for variable ranking in linear regression based on Mahalanobis-decorrelation of the explanatory variables that provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables.
A robust predictive approach for canonical correlation analysis
Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework
Optimal Whitening and Decorrelation
TLDR
It is demonstrated that investigating the cross-covariance and the cross-correlation matrix between sphered and original variables makes it possible to break the rotational invariance and to identify optimal whitening transformations.