# Integrated Principal Components Analysis

```
@article{Tang2021IntegratedPC,
  title   = {Integrated Principal Components Analysis},
  author  = {Tiffany M. Tang and Genevera I. Allen},
  journal = {J. Mach. Learn. Res.},
  year    = {2021},
  volume  = {22},
  pages   = {198:1-198:71}
}
```

Data integration, or the strategic analysis of multiple sources of data simultaneously, can often lead to discoveries that may be hidden in individualistic analyses of a single data source. We develop a new unsupervised data integration method named Integrated Principal Components Analysis (iPCA), which is a model-based generalization of PCA and serves as a practical tool to find and visualize common patterns that occur in multiple data sets. The key idea driving iPCA is the matrix-variate…
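Since iPCA is described as a model-based generalization of PCA, the classical baseline it extends can be sketched in a few lines. This is a minimal illustration of ordinary PCA via eigendecomposition of the sample covariance, not the iPCA method itself; the function name and data are illustrative only.

```python
import numpy as np

def pca_scores(X, k):
    """Classical PCA: project centered data onto the top-k
    eigenvectors of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                  # center each column
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # sample covariance
    _, vecs = np.linalg.eigh(cov)            # eigenvalues ascending
    top = vecs[:, ::-1][:, :k]               # top-k eigenvectors
    return Xc @ top                          # k-dimensional scores

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
scores = pca_scores(X, 2)
print(scores.shape)  # (100, 2)
```

iPCA couples several such decompositions across data sets through a shared row-covariance structure; the sketch above is only the single-matrix special case.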


## 13 Citations

Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data

- Computer Science, J. Mach. Learn. Res.
- 2021

The iGecco+ approach selects features from each data view that are best for determining the groups, often leading to improved integrative clustering; a new type of generalized multi-block ADMM algorithm using sub-problem approximations fits the model more efficiently for big data sets.

Stepwise Covariance-Free Common Principal Components (CF-CPC) With an Application to Neuroscience

- Computer Science, Frontiers in Neuroscience
- 2021

A covariance-free stepwise CPC is proposed that requires only O(kn) memory, where n is the total number of examples; it allows extracting the shared anatomical structure of EEG and MEG source spectra across a frequency range of 0.01–40 Hz.
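The "covariance-free" idea above can be illustrated with plain power iteration: the leading component is found through matrix-vector products with the data, so the p × p covariance matrix is never formed. This is a generic sketch of the principle, not the paper's stepwise CPC algorithm.

```python
import numpy as np

def covariance_free_top_pc(X, iters=200):
    """Power iteration for the leading principal component that never
    materializes the p x p covariance: each step multiplies by X and
    X.T, keeping memory at O(np) for the data plus O(p) for the iterate."""
    Xc = X - X.mean(axis=0)
    v = np.random.default_rng(0).normal(size=X.shape[1])
    for _ in range(iters):
        v = Xc.T @ (Xc @ v)          # implicit covariance-vector product
        v /= np.linalg.norm(v)
    return v

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 50))
X[:, 0] *= 5.0                       # plant a dominant direction
v = covariance_free_top_pc(X)
print(v.shape)  # (50,)
```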

Integrative analysis of multi-omics data improves model predictions: an application to lung cancer

- Biology, bioRxiv
- 2020

This work shows how an integrative analysis that preserves both components of variation is more appropriate than analyses considering only individual or only joint components, and that both joint and individual components contribute to better-quality model predictions and facilitate the interpretation of the underlying biological processes.

Computationally Efficient Learning of Statistical Manifolds

- Computer Science
- 2021

It is demonstrated how underlying structures in high dimensional data, including anomalies, can be visualized and identified, in a way that is scalable to large datasets, and is robust to different manifold learning algorithms and different approximate nearest neighbor algorithms.

Integrative, multi-omics, analysis of blood samples improves model predictions: applications to cancer

- Biology, BMC Bioinform.
- 2021

This work identifies joint and individual contributions of DNA methylation, miRNA and mRNA expression collected from blood samples in a lung cancer case–control study nested within the Norwegian Women and Cancer (NOWAC) cohort study, and uses such components to build prediction models for case–control and metastatic status.

Stacked Autoencoder Based Multi-Omics Data Integration for Cancer Survival Prediction

- Computer Science, arXiv
- 2022

This paper proposes a novel method to integrate multi-omics data for cancer survival prediction, called Stacked AutoEncoder-based Survival Prediction Neural Network (SAEsurv-net), which addresses the curse of dimensionality with a two-stage dimensionality reduction strategy and handles multi-omics heterogeneity with a stacked autoencoder model.

Integration strategies of multi-omics data for machine learning analysis

- Computer Science, Computational and Structural Biotechnology Journal
- 2021

No-go Theorem for Acceleration in the Hyperbolic Plane

- Mathematics, arXiv
- 2021

It is proved that in a noisy setting, there is no analogue of accelerated gradient descent for geodesically convex functions on the hyperbolic plane.

Principal Components Along Quiver Representations

- Mathematics, Foundations of Computational Mathematics
- 2022

Quiver representations arise naturally in many areas across mathematics. Here we describe an algorithm for calculating the vector space of sections, or compatible assignments of vectors to vertices,…

A No-go Theorem for Robust Acceleration in the Hyperbolic Plane

- Mathematics, NeurIPS
- 2021

In recent years there has been significant effort to adapt the key tools and ideas in convex optimization to the Riemannian setting. One key challenge has remained: Is there a Nesterov-like…

## References

Showing 1-10 of 46 references

Distributed estimation of principal eigenspaces.

- Computer Science, Annals of Statistics
- 2019

It is shown that when the number of machines is not unreasonably large, distributed PCA performs as well as whole-sample PCA, even without full access to the whole data.
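The distributed scheme summarized above can be sketched as a one-shot aggregation: each machine computes its local top-k eigenvectors, and a central server averages the local projection matrices and re-extracts the top-k eigenspace. This is an illustrative sketch of the averaging idea, not the cited paper's exact estimator or guarantees.

```python
import numpy as np

def distributed_pca(blocks, k):
    """One-shot distributed PCA: average the local projection matrices
    V @ V.T from each machine, then take the top-k eigenvectors of the
    average as the aggregated eigenspace estimate."""
    p = blocks[0].shape[1]
    avg_proj = np.zeros((p, p))
    for Xi in blocks:
        cov = Xi.T @ Xi / Xi.shape[0]        # local sample covariance
        _, vecs = np.linalg.eigh(cov)        # eigenvalues ascending
        V = vecs[:, -k:]                     # local top-k eigenvectors
        avg_proj += V @ V.T                  # projection onto local eigenspace
    avg_proj /= len(blocks)
    _, vecs = np.linalg.eigh(avg_proj)
    return vecs[:, -k:]                      # aggregated eigenspace basis

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
X[:, :2] *= 4.0                              # plant a strong 2-d signal
blocks = np.array_split(X, 4)                # simulate 4 machines
V = distributed_pca(blocks, 2)
print(V.shape)  # (10, 2)
```

Only the p × k local bases travel over the network, which is the communication saving over shipping raw data.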

JOINT AND INDIVIDUAL VARIATION EXPLAINED (JIVE) FOR INTEGRATED ANALYSIS OF MULTIPLE DATA TYPES.

- Computer Science, The Annals of Applied Statistics
- 2013

JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data, and provides new directions for the visual exploration of joint and individual structure.

TRANSPOSABLE REGULARIZED COVARIANCE MODELS WITH AN APPLICATION TO MISSING DATA IMPUTATION.

- Computer Science, Mathematics, The Annals of Applied Statistics
- 2010

Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.

Multiple factor analysis: principal component analysis for multitable and multiblock data sets

- Computer Science
- 2013

This article presents MFA, reviews recent extensions, and illustrates it with a detailed example showing that the common factor scores can be obtained by replacing the original normalized data tables with the normalized factor scores obtained from the PCA of each of these tables.

Robust Kronecker Product PCA for Spatio-Temporal Covariance Estimation

- Computer Science, IEEE Transactions on Signal Processing
- 2015

A robust PCA-based algorithm is introduced to estimate the covariance under the Kronecker PCA model, and an extension to Toeplitz temporal factors is provided, producing a parameter reduction for temporally stationary measurement modeling.

Structure-revealing data fusion

- Computer Science, BMC Bioinformatics
- 2013

A structure-revealing data fusion model that can jointly analyze heterogeneous, incomplete data sets with shared and unshared components is proposed and its promising performance as well as potential limitations on both simulated and real data are demonstrated.

Sparse permutation invariant covariance estimation

- Computer Science, Mathematics
- 2008

A method is proposed for constructing a sparse estimator of the inverse covariance (concentration) matrix in high-dimensional settings; it uses a penalized normal likelihood approach and forces sparsity through a lasso-type penalty.
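The lasso-penalized likelihood idea in this reference can be tried with scikit-learn's generic graphical lasso implementation. This is not the paper's own estimator; the simulated tridiagonal precision matrix below is purely illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Sample from a Gaussian whose precision (inverse covariance) matrix
# is sparse: tridiagonal with 0.3 off-diagonals.
rng = np.random.default_rng(0)
prec = np.eye(5) + 0.3 * (np.eye(5, k=1) + np.eye(5, k=-1))
X = rng.multivariate_normal(np.zeros(5), np.linalg.inv(prec), size=500)

# Lasso-penalized normal likelihood: larger alpha drives more entries
# of the estimated precision matrix to exactly zero.
model = GraphicalLasso(alpha=0.05).fit(X)
print(model.precision_.shape)  # (5, 5)
```

The zero pattern of `model.precision_` can be read as an estimated conditional-independence graph, which is the connection to the graph-estimation references below.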

Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation

- Computer Science, IEEE Transactions on Signal Processing
- 2016

Numerical experiments show that the proposed eigenvector extraction algorithm outperforms existing algorithms in terms of support recovery and explained variance, whereas the covariance estimation algorithms improve the sample covariance estimator significantly.

Gemini: Graph estimation with matrix variate normal instances

- Mathematics, Computer Science
- 2014

This paper develops new methods for estimating the graphical structures and underlying parameters, namely the row and column covariance and inverse covariance matrices, from matrix-variate data, and provides simulation evidence showing that one can recover the graphical structures as well as estimate the precision matrices, as predicted by theory.

Analysis of multiblock and hierarchical PCA and PLS models

- Computer Science
- 1998

It is recommended that, in cases where the variables can be separated into meaningful blocks, the standard PCA and PLS methods be used to build the models; the weights and loadings of the individual blocks and the super block, and the percentage of variation explained in each block, can then be calculated from the results.