A Bayesian method for detecting pairwise associations in compositional data

@article{Schwager2017ABM,
  title={A Bayesian method for detecting pairwise associations in compositional data},
  author={Emma Schwager and Himel Mallick and Steffen Ventz and Curtis Huttenhower},
  journal={PLoS Computational Biology},
  year={2017},
  volume={13}
}
Compositional data consist of vectors of proportions normalized to a constant sum from a basis of unobserved counts. The sum constraint makes inference on correlations between unconstrained features challenging due to the information loss from normalization. However, such correlations are of long-standing interest in fields including ecology. We propose a novel Bayesian framework (BAnOCC: Bayesian Analysis of Compositional Covariance) to estimate a sparse precision matrix through a LASSO prior…
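The effect of the sum constraint described in the abstract can be illustrated with a small simulation. The sketch below is not BAnOCC and is not taken from the paper; it only shows, under assumed settings (three independent log-normal basis features, 2000 samples, a fixed seed, and the hypothetical helper names `pearson` and `close_to_unit_sum`), how normalizing to a constant sum induces negative correlation between features that are uncorrelated in the unobserved basis.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def close_to_unit_sum(row):
    """Normalize one sample so that its components sum to 1."""
    total = sum(row)
    return [value / total for value in row]


# Assumed toy setup: 3 independent basis features, 2000 samples.
rng = random.Random(0)
basis = [[rng.lognormvariate(0.0, 0.5) for _ in range(3)] for _ in range(2000)]
composition = [close_to_unit_sum(row) for row in basis]

# Correlation between features 0 and 1, before and after normalization.
basis_corr = pearson([r[0] for r in basis], [r[1] for r in basis])
comp_corr = pearson([r[0] for r in composition], [r[1] for r in composition])

print(f"basis correlation:       {basis_corr:+.3f}")
print(f"composition correlation: {comp_corr:+.3f}")
```

With independent, identically distributed basis features, the basis correlation should be close to zero, while the composition correlation should be clearly negative (for k exchangeable components the population value is -1/(k-1)). This is the kind of artifact that methods for compositional covariance aim to see through.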

Paper Mentions

Variational Inference for sparse network reconstruction from count data
TLDR
This work adopts a latent model in which counts are modeled directly by Poisson distributions conditional on latent (hidden) correlated Gaussian variables, and shows that the approach is highly competitive with existing methods on simulations inspired by microbiological data.
A zero inflated log-normal model for inference of sparse microbial association networks
TLDR
A zero-inflated log-normal graphical model is presented specifically aimed at handling “biological” zeros, and significant performance gains are demonstrated over state-of-the-art statistical methods for the inference of microbial association networks.
Variational Inference of Sparse Network from Count Data
The problem of network reconstruction from continuous data has been extensively studied and most state-of-the-art methods rely on variants of Gaussian Graphical Models (GGM). GGM are unfortunately…
A Statistical Model for Describing and Simulating Microbial Community Profiles
TLDR
SparseDOSSA’s performance is demonstrated for accurately modeling human-associated microbial population profiles; generating synthetic communities with controlled population and ecological structures; spiking-in true positive synthetic associations to benchmark analysis methods; and recapitulating an end-to-end mouse microbiome feeding experiment, which represent the most common analysis types in assessment of real microbial community environmental and epidemiological statistics.
Normalization methods for microbial abundance data strongly affect correlation estimates
Consistent normalization of microbial genomic survey count data is fundamental to modern microbiome research. Technical artifacts in these data often obstruct standard comparison of microbial…
Uncovering the drivers of host-associated microbiota with joint species distribution modeling
TLDR
A novel extension of joint species distribution models (JSDMs) is introduced which can straightforwardly accommodate and discern between effects such as host phylogeny and traits, recorded covariates like diet and collection sites, among other ecological processes.
BEEM-Static: Accurate inference of ecological interactions from cross-sectional microbiome data
The structure and function of diverse microbial communities is underpinned by ecological interactions that remain uncharacterized. With rapid adoption of next-generation sequencing for studying…
Uncovering the drivers of host‐associated microbiota with joint species distribution modelling
TLDR
A novel extension of joint species distribution models (JSDMs) is introduced which can straightforwardly accommodate and discern between effects such as host phylogeny and traits, recorded covariates such as diet and collection site, among other ecological processes.
BEEM-Static: Accurate inference of ecological interactions from cross-sectional metagenomic data
TLDR
An expectation-maximization algorithm that can be applied to cross-sectional datasets to infer interaction networks based on an ecological model (generalized Lotka-Volterra) and provides new opportunities for mining ecologically interpretable interactions and systems insights from the growing corpus of metagenomic data.

References

SHOWING 1-10 OF 37 REFERENCES
CCLasso: correlation inference for compositional data through Lasso
TLDR
A novel method called CCLasso based on least squares with ℓ1 penalty to infer the correlation network for latent variables of compositional data from metagenomic data is proposed and an effective alternating direction algorithm from augmented Lagrangian method is used to solve the optimization problem.
Efficient estimation of covariance selection models
A Bayesian method is proposed for estimating an inverse covariance matrix from Gaussian data. The method is based on a prior that allows the off-diagonal elements of the inverse covariance matrix to…
Inferring Correlation Networks from Genomic Survey Data
TLDR
It is shown that community diversity is the key factor that modulates the acuteness of such compositional effects, and a new approach is developed, called SparCC, which is capable of estimating correlation values from compositional data.
A new approach to null correlations of proportions
Much work on the statistical analysis of compositional data has concentrated on the difficulty of interpreting correlations between proportions with an assortment of tests for null correlations, for…
Sparse and Compositionally Robust Inference of Microbial Ecological Networks
TLDR
SParse InversE Covariance Estimation for Ecological Association Inference is presented, a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that outperforms state-of-the-art methods to recover edges and network properties on synthetic data under a variety of scenarios.
Bayesian Variable Selection in Linear Regression
Abstract This article is concerned with the selection of subsets of predictor variables in a linear regression model for the prediction of a dependent variable. It is based on a Bayesian approach,…
Investigating microbial co-occurrence patterns based on metagenomic compositional data
TLDR
A novel method, regularized estimation of the basis covariance based on compositional data (REBACCA), to identify significant co-occurrence patterns by finding sparse solutions to a system with a deficient rank is proposed.
Bayesian Graphical Lasso Models and Efficient Posterior Computation
Recently, the graphical lasso procedure has become popular in estimating Gaussian graphical models. In this paper, we introduce a fully Bayesian treatment of graphical lasso models. We first…
The Bayesian elastic net
Elastic net (Zou and Hastie 2005) is a flexible regularization and variable selection method that uses a mixture of L1 and L2 penalties. It is particularly useful when there are much more predictors…
The horseshoe estimator for sparse signals
This paper proposes a new approach to sparsity, called the horseshoe estimator, which arises from a prior based on multivariate-normal scale mixtures. We describe the estimator's advantages over…