Bayesian Factor Analysis for Inference on Interactions

  • Federico Ferrari, David B. Dunson
  • Published 25 April 2019
  • Computer Science
  • Journal of the American Statistical Association, pages 1521-1532
Abstract: This article is motivated by the problem of inference on interactions among chemical exposures impacting human health outcomes. Chemicals often co-occur in the environment or in synthetic mixtures and, as a result, exposure levels can be highly correlated. We propose a latent factor joint model, which includes shared factors in both the predictor and response components while assuming conditional independence. By including a quadratic regression in the latent variables in the response…
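To make the model structure in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes exposures are generated as a linear function of shared latent factors plus noise, and the response as a quadratic regression in those same factors. All names (`Lambda`, `omega`, `Omega`, `simulate_latent_factor_quadratic`), the dimensions, and the noise scale are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def simulate_latent_factor_quadratic(n=500, p=10, k=3, noise_sd=0.5, seed=0):
    """Simulate exposures X and response y that share latent factors eta.

    Assumed (hypothetical) structure:
        X_i = Lambda @ eta_i + noise            (predictor component)
        y_i = eta_i' omega + eta_i' Omega eta_i + noise   (response component)
    """
    rng = np.random.default_rng(seed)
    Lambda = rng.normal(size=(p, k))       # factor loadings (assumed name)
    eta = rng.normal(size=(n, k))          # shared latent factors
    X = eta @ Lambda.T + noise_sd * rng.normal(size=(n, p))

    omega = rng.normal(size=k)             # linear effects of the factors
    Omega = rng.normal(size=(k, k))
    Omega = (Omega + Omega.T) / 2          # symmetric quadratic-effect matrix
    # Row-wise quadratic form eta_i' Omega eta_i
    quad = np.einsum("ij,jk,ik->i", eta, Omega, eta)
    y = eta @ omega + quad + noise_sd * rng.normal(size=n)
    return X, y, eta, Lambda, omega, Omega

X, y, eta, Lambda, omega, Omega = simulate_latent_factor_quadratic()
# Exposures driven by a few shared factors come out correlated,
# which is the setting the abstract describes.
corr = np.corrcoef(X, rowvar=False)
```

Because every exposure loads on the same small set of factors, a quadratic term in the factors implies pairwise interactions among the observed exposures without a separate coefficient being specified for each pair.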
A Bayesian approach to inference is proposed, placing variable selection priors on the different components of the semiparametric model, and a Markov chain Monte Carlo (MCMC) algorithm is developed, effectively reducing dimensionality of the model search.
Estimation and false discovery control for the analysis of environmental mixtures.
The analysis of environmental mixtures is of growing importance in environmental epidemiology, and one of the key goals in such analyses is to identify exposures and their interactions that are
Gene‐gene interaction analysis incorporating network information via a structured Bayesian approach
This study is among the first to identify gene‐gene interactions with the assistance of network selection, while simultaneously accommodating the underlying network structures of both main effects and interactions.
Quantifying the impact of association of environmental mixture in a type 1 and type 2 error balanced framework
In environmental epidemiology, analysis of environmental mixture in association to health effects is gaining popularity. Such models mostly focus on inferences of hypotheses or summarizing strength
Effects of gestational exposures to chemical mixtures on birth weight using Bayesian factor analysis in the Health Outcome and Measures of Environment (HOME) Study
As researchers move beyond the “one chemical at a time” analysis to evaluate mixture effects, several challenges related to collinearity among individual chemicals and providing easily interpretable analysis results have arisen.
Bayesian Data Synthesis and the Utility-Risk Trade-Off for Mixed Epidemiological Data
A cohesive Bayesian framework is introduced for the generation of fully synthetic high dimensional micro datasets of mixed categorical, binary, count, and continuous variables, and a modified data synthesis strategy is designed to target and preserve conditional relationships between various exposures and key outcome variables through regression analysis.
Critical Window Variable Selection for Mixtures: Estimating the Impact of Multiple Air Pollutants on Stillbirth
Understanding the role of time-varying pollution mixtures on human health is critical as people are simultaneously exposed to multiple pollutants during their lives. For vulnerable sub-populations
Robust sparse Bayesian infinite factor models
This work proposes a Bayesian factor model for heavy-tailed high-dimensional data based on multivariate Student-$t$ likelihood to obtain better covariance estimation and provides a theoretical result that the posterior of the proposed model is weakly consistent under reasonable conditions.
Powering Research through Innovative Methods for Mixtures in Epidemiology (PRIME) Program: Novel and Expanded Statistical Methods
37 new methods from PRIME projects are reviewed and summarized to enable more informed analyses of environmental mixtures and stress training for early career scientists as well as innovation in statistical methodology as an ongoing need.


Bayesian Structural Equation Modeling
Bayesian factor regression models in the''large p
Bayesian factor regression models with many explanatory variables are discussed, and sparse latent factor models are introduced to induce sparsity in factor loadings matrices to provide a novel approach to variable selection with very many predictors.
Bayesian Gaussian Copula Factor Models for Mixed Data
A novel class of Bayesian Gaussian copula factor models that decouple the latent factors from the marginal distributions is proposed and new theoretical and empirical justifications for using this likelihood in Bayesian inference are provided.
Bayesian inference of epistatic interactions in case-control studies
It is demonstrated that the proposed 'bayesian epistasis association mapping' method is significantly more powerful than existing approaches and that genome-wide case-control epistasis mapping with many thousands of markers is both computationally and statistically feasible.
Sparse Statistical Modelling in Gene Expression Genomics
Traditional Bayesian “variable selection” priors are extended to new hierarchical sparsity priors that are providing substantial practical gains in addressing false discovery and isolating significant gene-specific parameters/effects in highly multivariate studies involving thousands of genes.
Comparison of Approaches in Estimating Interaction and Quadratic Effects of Latent Variables
This article reviews, elaborates and compares several approaches for analyzing nonlinear models with interaction and/or quadratic effects, and finds that whilst the Bayesian and the exact ML approaches produce satisfactory results in all the settings under consideration, the other approaches can only produce reasonable results in simple models with large sample sizes.
Sparse Bayesian infinite factor models.
This work proposes a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases, and develops an efficient Gibbs sampler that scales well as data dimensionality increases.
Convex Modeling of Interactions With Strong Heredity
  • Asad Haris, D. Witten, N. Simon
  • Computer Science, Mathematics
    Journal of Computational and Graphical Statistics
  • 2016
FAMILY is a generalization of several existing methods, such as VANISH, hierNet, the all-pairs lasso, and the lasso using only main effects, formulated as the solution to a convex optimization problem, which is solved using an efficient alternating directions method of multipliers (ADMM) algorithm.
Penalized Interaction Estimation for Ultrahigh Dimensional Quadratic Regression
This article introduces a novel method which allows us to estimate the main effects and interactions separately in high dimensional quadratic regression, and develops an efficient ADMM algorithm to implement the penalized estimation.
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics
These case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers.