Invariant Theory and Scaling Algorithms for Maximum Likelihood Estimation

@article{Amendola2021InvariantTA,
  title={Invariant Theory and Scaling Algorithms for Maximum Likelihood Estimation},
  author={Carlos Am{\'e}ndola and Kathl{\'e}n Kohn and Philipp Reichenbach and Anna Seigal},
  journal={SIAM J. Appl. Algebra Geom.},
  year={2021},
  volume={5},
  pages={304--337}
}
We show that maximum likelihood estimation in statistics is equivalent to finding the capacity in invariant theory, in two statistical settings: log-linear models and Gaussian transformation families. The former includes the classical independence model, while the latter includes matrix normal models and Gaussian graphical models given by transitive directed acyclic graphs. We use stability under group actions to characterize boundedness of the likelihood, and existence and uniqueness of the…
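To make the abstract's equivalence concrete, the following is a minimal sketch of the capacity and stability notions as they are usually set up in this literature; it is an informal summary under standard conventions, not a restatement of the paper's precise theorems. For a group G acting linearly on an inner-product space V, the capacity of a vector v is the infimal squared norm over its orbit, and the stability of v (whether this infimum is zero, positive, or attained) is what governs the likelihood:

% Informal sketch; standard invariant-theory conventions assumed (amsmath for \operatorname, \lVert).
% G acts linearly on V, and \lVert \cdot \rVert is the norm of the chosen inner product.
\[
  \operatorname{cap}(v) \;=\; \inf_{g \in G} \, \lVert g \cdot v \rVert^{2}, \qquad v \in V.
\]
% Rough stability dictionary alluded to in the abstract (the paper gives the exact statements and hypotheses):
%   cap(v) = 0                        (v unstable)    ~  likelihood unbounded from above
%   cap(v) > 0                        (v semistable)  ~  likelihood bounded from above
%   infimum attained / closed orbit   (v polystable)  ~  an MLE exists
%   closed orbit and small stabilizer (v stable)      ~  the MLE is unique

Roughly, the sample data play the role of the vector v and the statistical model determines the group G and its action; the scaling algorithms of the title can then be viewed as iterative methods for approaching this infimum.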

Figures and Tables from this paper

Toric invariant theory for maximum likelihood estimation in log-linear models
We establish connections between invariant theory and maximum likelihood estimation for discrete statistical models. We show that norm minimization over a torus orbit is equivalent to maximum
Application of Univariate Probability Distributions Fitting With Monte Carlo Simulation
In this study, we present a univariate probability distribution through application of three sub- and super-exponential (heavier-longer and lighter-shorter tailed) distribution fits. This univariate family
Principal Components Along Quiver Representations
Quiver representations arise naturally in many areas across mathematics. Here we describe an algorithm for calculating the vector space of sections, or compatible assignments of vectors to vertices,
Logarithmic Voronoi polytopes for discrete linear models
We study logarithmic Voronoi cells for linear statistical models and partial linear models. The logarithmic Voronoi cells at points on such a model are polytopes. To any d-dimensional linear model
A No-go Theorem for Robust Acceleration in the Hyperbolic Plane
In recent years there has been significant effort to adapt the key tools and ideas in convex optimization to the Riemannian setting. One key challenge has remained: Is there a Nesterov-like
Negative curvature obstructs acceleration for geodesically convex optimization, even with exact first-order oracles
It is shown that acceleration remains unachievable for any deterministic algorithm which receives exact gradient and function-value information (unbounded queries, no noise); for hyperbolic spaces, Riemannian gradient descent is optimal on the class of smooth and geodesically convex functions.
Near optimal sample complexity for matrix and tensor normal models via geodesic convexity
This work studies the estimation of the Kronecker factors of the covariance matrix in the matrix and tensor normal models and shows nonasymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics.
Ranks of linear matrix pencils separate simultaneous similarity orbits
This paper solves the two-sided version and provides a counterexample to the general version of the 2003 conjecture by Hadwin and Larson. Consider evaluations of linear matrix pencils L =
Symmetries in Directed Gaussian Graphical Models
An algorithm is presented to compute the maximum likelihood estimate (MLE) in an RDAG model, and when the MLE exists is characterised via linear independence conditions, relating properties of a graph and its colouring to the number of samples needed for the MLE to exist and to be unique.
The maximum likelihood degree of sparse polynomial systems
We consider statistical models arising from the common set of solutions to a sparse polynomial system with general coefficients. The maximum likelihood degree counts the number of critical points of
...

References

SHOWING 1-10 OF 55 REFERENCES
The Hilbert Null-cone on Tuples of Matrices and Bilinear Forms
We describe the null-cone of the representation of G on M^p, where either G = SL(W) × SL(V) and M = Hom(V, W) (linear maps), or G = SL(V) and M is one of the representations S^2(V*) (symmetric
Towards a Theory of Non-Commutative Optimization: Geodesic 1st and 2nd Order Methods for Moment Maps and Polytopes
This paper initiates a systematic development of a theory of non-commutative optimization, a setting which greatly extends ordinary (Euclidean) convex optimization, and develops two general methods in the geodesic setting, a first order and a second order method which respectively receive first and second order information on the "derivatives" of the function to be optimized.
Moduli of representations of finite dimensional algebras
In this paper, we present a framework for studying moduli spaces of finite dimensional representations of an arbitrary finite dimensional algebra A over an algebraically closed field k. (The abelian
  • Quarterly Journal of Mathematics. Second Series, 45(180):515–530
  • 1994
Alternating minimization, scaling algorithms, and the null-cone problem from invariant theory
  • arXiv:1711.08039
  • 2017
Existence and uniqueness of the Kronecker covariance MLE
In matrix-valued datasets the sampled matrices often exhibit correlations among both their rows and their columns. A useful and parsimonious model of such dependence is the matrix normal model, in
Generalized Iterative Scaling for Log-Linear Models
A Deterministic Polynomial Time Algorithm for Non-commutative Rational Identity Testing
A deterministic polynomial time algorithm is given for testing whether a symbolic matrix in non-commuting variables over Q is invertible or not; it efficiently solves several problems in different areas which had only exponential-time algorithms prior to this work.
Groups acting on Gaussian graphical models
The structure of the Gaussian graphical models is revealed by explicitly describing, for any undirected graph, the (maximal) matrix group acting on the space of concentration matrices in the model, and the maximal invariant of this group on the sample space is described.
...