Perturbed factor analysis: Improving generalizability across studies
@article{Roy2019PerturbedFA, title={Perturbed factor analysis: Improving generalizability across studies}, author={Arkaprava Roy and Isaac Lavine and Amy H. Herring and David B. Dunson}, journal={arXiv: Methodology}, year={2019} }
Factor analysis is routinely used for dimensionality reduction. However, a major issue is "brittleness", in which one can obtain substantially different factors in analyzing similar datasets. Factor models have been developed for multi-study data by using additive expansions incorporating common and study-specific factors. However, allowing study-specific factors runs counter to the goal of producing a single set of factors that hold across studies. As an alternative, we propose a class of…
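The additive expansion mentioned in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' model or code: it assumes each study's observations follow y = Lambda @ eta + Phi_s @ zeta + eps, with standard-normal factors and independent Gaussian noise. The names `Lambda`, `Phi`, `implied_covariance`, and `simulate_study`, and all dimensions, are hypothetical choices made for this example.

```python
import numpy as np


def implied_covariance(common_loadings, specific_loadings, noise_var):
    """Covariance implied by y = Lambda @ eta + Phi_s @ zeta + eps.

    Assumes standard-normal factors and independent noise with
    variance noise_var on every coordinate (an assumption of this sketch).
    """
    p = common_loadings.shape[0]
    return (
        common_loadings @ common_loadings.T
        + specific_loadings @ specific_loadings.T
        + np.diag(np.broadcast_to(noise_var, (p,)))
    )


def simulate_study(common_loadings, specific_loadings, noise_var, n, rng):
    """Draw n observations from one study under the additive expansion."""
    p, k = common_loadings.shape
    q = specific_loadings.shape[1]
    eta = rng.standard_normal((n, k))    # common factors
    zeta = rng.standard_normal((n, q))   # study-specific factors
    eps = rng.standard_normal((n, p)) * np.sqrt(noise_var)
    return eta @ common_loadings.T + zeta @ specific_loadings.T + eps


rng = np.random.default_rng(0)
p, k, q = 6, 2, 1                        # hypothetical dimensions
Lambda = rng.standard_normal((p, k))     # loadings shared by all studies
Phi = [rng.standard_normal((p, q)) for _ in range(2)]  # one per study

covs = [implied_covariance(Lambda, P, 0.5) for P in Phi]
data = [simulate_study(Lambda, P, 0.5, 200, rng) for P in Phi]
```

Under these assumptions the two implied covariances differ only through the study-specific term Phi_s @ Phi_s.T, which is the part of the expansion that the abstract says runs counter to a single set of factors shared across studies.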
5 Citations
Hierarchical Resampling for Bagging in Multi-Study Prediction with Applications to Human Neurochemical Sensing
- Computer Science · bioRxiv
- 2019
We propose the “study strap ensemble,” which combines advantages of two common approaches to fitting prediction models when multiple training datasets (“studies”) are available: pooling studies and…
Optimal Ensemble Construction for Multi-Study Prediction with Applications to COVID-19 Excess Mortality Estimation
- Computer Science · ArXiv
- 2021
It is shown that when little data is available for a country before the onset of the pandemic, leveraging data from other countries can substantially improve prediction accuracy and the method remains competitive with or outperforms multi-study stacking and other earlier methods across a range of between-study heterogeneity levels.
Nonparametric Group Variable Selection with Multivariate Response for Connectome-Based Modeling of Cognitive Scores
- Computer Science
- 2021
The proposed method identifies the important brain regions and nodal attributes for cognitive functioning, and also identifies interesting low-dimensional dependency structures among the cognition-related test scores.
Nonparametric Group Variable Selection with Multivariate Response for Connectome-Based Prediction of Cognitive Scores
- Computer Science
- 2021
This article identifies the important brain regions and nodal attributes for cognitive functioning, and also identifies interesting low-dimensional dependency structures among the cognition-related test scores.
Bayesian Combinatorial Multi-Study Factor Analysis
- Computer Science
- 2020
Tetris is introduced as a new method for Bayesian combinatorial multi-study factor analysis, which identifies latent factors that can be shared by any combination of studies and models the subsets of studies that share latent factors with an Indian Buffet Process.
References
SHOWING 1-10 OF 27 REFERENCES
Bayesian multistudy factor analysis for high-throughput biological data
- Computer Science · The Annals of Applied Statistics
- 2021
The proposed approach performs very well in a range of different scenarios and outperforms standard factor analysis in all of them, identifying replicable signal in unsupervised genomic applications.
Bayesian Model Assessment in Factor Analysis
- Computer Science
- 2004
This work explores reversible jump MCMC methods that build on sets of parallel Gibbs sampling-based analyses to generate suitable empirical proposal distributions and that address the challenging problem of finding efficient proposals in high-dimensional models.
Sparse Bayesian infinite factor models
- Computer Science · Biometrika
- 2011
This work proposes a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases, and develops an efficient Gibbs sampler that scales well as data dimensionality increases.
Bayesian time-aligned factor analysis of paired multivariate time series
- Computer Science · J. Mach. Learn. Res.
- 2021
A Bayesian dynamic factor modeling framework called Time Aligned Common and Individual Factor Analysis (TACIFA) is proposed that includes uncertainty in time alignment through an unknown warping function and enables efficient computation through a Hamiltonian Monte Carlo (HMC) algorithm.
Non-iterative Joint and Individual Variation Explained
- Computer Science
- 2015
This paper introduces Non-iterative Joint and Individual Variation Explained (Non-iterative JIVE), which captures both joint and individual variation within each data block and is robust against heterogeneity among data blocks without a need for normalization.
Joint and Individual Variation Explained (JIVE) for Integrated Analysis of Multiple Data Types
- Computer Science · The Annals of Applied Statistics
- 2013
JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data, and provides new directions for the visual exploration of joint and individual structure.
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics
- Computer Science, Biology · Journal of the American Statistical Association
- 2008
These case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers.
Bayesian Factorizations of Big Sparse Tensors
- Computer Science · Journal of the American Statistical Association
- 2015
Taking a Bayesian approach, priors are placed on terms in the factorization and an efficient Gibbs sampler for posterior computation is developed and shown to have excellent performance in simulations and several real data applications.