Corpus ID: 237485252

Interaction Models and Generalized Score Matching for Compositional Data

@inproceedings{Yu2021InteractionMA,
  title={Interaction Models and Generalized Score Matching for Compositional Data},
  author={Shiqing Yu and Mathias Drton and Ali Shojaie},
  year={2021}
}
Applications such as the analysis of microbiome data have led to renewed interest in statistical methods for compositional data, i.e., multivariate data in the form of probability vectors that contain relative proportions. In particular, there is considerable interest in modeling interactions among such relative proportions. To this end we propose a class of exponential family models that accommodate general patterns of pairwise interaction while being supported on the probability simplex… 

References

Kernel-Penalized Regression for Analysis of Microbiome Data

TLDR
This paper uses kernel-based methods to show how to incorporate a variety of extrinsic information, such as phylogeny, into penalized regression models that estimate taxon-specific associations with a phenotype or clinical outcome, and shows how this regression framework can be used to address the compositional nature of multivariate predictors composed of relative abundances.

Regression Analysis for Microbiome Compositional Data

One important problem in microbiome analysis is to identify the bacterial taxa that are associated with a response, where the microbiome data are summarized as the composition of the bacterial taxa

A Logistic Normal Multinomial Regression Model for Microbiome Compositional Data Analysis

TLDR
This work proposes an additive logistic normal multinomial regression model to associate covariates with bacterial composition and develops a Monte Carlo expectation‐maximization algorithm to implement the penalized likelihood estimation.

Generalized Score Matching for Non-Negative Data

TLDR
This paper gives a generalized form of score matching for non-negative data that improves estimation efficiency and addresses an overlooked inexistence problem by generalizing the regularized score matching method of Lin et al. (2016) and improving its theoretical guarantees for non-negative Gaussian graphical models.

Comparisons of Distance Methods for Combining Covariates and Abundances in Microbiome Studies

TLDR
This study shows that DPCoA is less robust to outliers, and more robust to small noisy fluctuations around zero.

Graphical Models for Non-Negative Data Using Generalized Score Matching

TLDR
This paper gives a generalized form of score matching for non-negative data that improves estimation efficiency and generalizes the regularized score matching method of Lin et al. (2016) for non-negative Gaussian graphical models, with improved theoretical guarantees.

Generalized Score Matching for General Domains

TLDR
This paper applies a natural generalization of score matching to truncated graphical and pairwise interaction models, provides theoretical guarantees for the resulting estimators, and generalizes a recently proposed method from bounded to unbounded domains.

Estimation of High-Dimensional Graphical Models Using Regularized Score Matching

TLDR
It is confirmed that regularized score matching achieves state-of-the-art performance in the Gaussian case and provides a valuable tool for computationally efficient estimation in non-Gaussian graphical models.

Estimation of Non-Normalized Statistical Models by Score Matching

TLDR
While estimating the gradient of the log-density function is, in principle, a very difficult non-parametric problem, the paper proves a surprising result that gives a simple formula, which simplifies to a sample average of a sum of some derivatives of the log-density given by the model.

The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data

TLDR
A novel biplot is proposed that is based on an extension of the SVD, called the generalized matrix decomposition biplot (GMD-biplot), which involves an arbitrary matrix of similarities and the original matrix of variable measures, such as taxon abundances.