This work both analyzes the statistical error associated with any global optimum and proves that a simple algorithm based on projected gradient descent converges in polynomial time to a small neighborhood of the set of all global minimizers.
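As an illustrative sketch (not the paper's exact algorithm), projected gradient descent for sparse regression can be written as a plain gradient step followed by Euclidean projection onto an $\ell_1$-ball; the function names and step-size choice below are assumptions for the example:

```python
import numpy as np

def project_l1(v, radius):
    """Euclidean projection of v onto the l1-ball {b : ||b||_1 <= radius}."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]          # magnitudes in decreasing order
    css = np.cumsum(u)
    # largest index rho with u[rho] > (css[rho] - radius) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_gradient_descent(X, y, radius, step, iters=1000):
    """Minimize (1/2n)||y - Xb||^2 subject to ||b||_1 <= radius."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ b - y) / n     # gradient of the squared loss
        b = project_l1(b - step * grad, radius)
    return b
```

With a step size below the reciprocal of the loss's smoothness constant, each iterate stays feasible and the objective is non-increasing, which is the setting in which polynomial-time convergence to a neighborhood of the global minimizers is typically argued.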

Under restricted strong convexity on the loss and suitable regularity conditions on the penalty, it is proved that any stationary point of the composite objective function lies within statistical precision of the underlying parameter vector.

This work establishes a form of local statistical consistency for penalized regression estimators under fairly mild conditions on the error distribution. The analysis of the local curvature of the loss function also has useful consequences for optimization when the robust regression function and/or regularizer is nonconvex and the objective function possesses stationary points outside the local region.

It is shown that when the error variances are known or estimated to sufficiently high precision, the true DAG is the unique minimizer of the score computed using the reweighted squared $\ell_2$-loss.

The primal-dual witness proof method may be used to establish variable selection consistency and $\ell_\infty$-bounds for sparse regression problems, even when the loss function and/or regularizer are nonconvex.

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)]…

For certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph, and nonasymptotic guarantees for graph selection methods are provided.

The main result, characterizing the precise boundary between success and failure of maximum likelihood estimation when edge weights are drawn from discrete distributions, involves the Rényi divergence of order $\frac{1}{2}$ between the distributions of within-community and between-community edges.
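For concreteness, the Rényi divergence of order $\frac{1}{2}$ between discrete distributions $P$ and $Q$ has the closed form $D_{1/2}(P \| Q) = -2 \log \sum_x \sqrt{P(x)Q(x)}$, i.e. twice the negative log of the Bhattacharyya coefficient. A minimal computation (illustrative only; the function name is an assumption):

```python
import numpy as np

def renyi_half(p, q):
    """Renyi divergence of order 1/2 between discrete distributions p and q:
    D_{1/2}(p || q) = -2 * log( sum_x sqrt(p(x) * q(x)) ).
    Unlike general Renyi orders, it is symmetric in p and q, and zero iff p == q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -2.0 * np.log(np.sum(np.sqrt(p * q)))
```

In the community-detection setting, one would evaluate this divergence between the within-community and between-community edge-weight distributions; larger divergence makes the two easier to distinguish.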

This work derives new bounds for a notion of adversarial risk, characterizing the robustness of binary classifiers and neural network classifiers, and introduces transformations with the property that the risk of the transformed functions upper-bounds the adversarial risk of the original functions.

A probabilistic analysis of Pólya urns, corresponding to the number of uninfected neighbors in specific subtrees of the infection tree, is used to construct an example illustrating the shortcomings of source estimation techniques in settings where the underlying graph is asymmetric.
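As background for readers unfamiliar with the tool, a standard Pólya urn (a generic sketch, not the paper's specific construction) starts with some white and black balls; at each step a ball is drawn uniformly, returned, and reinforced with extra balls of the same color. The fraction of white balls is a martingale, so early draws have a lasting effect, which is exactly the kind of path dependence that can mislead source estimators:

```python
import random

def polya_urn(white, black, reinforce, steps, rng=None):
    """Simulate a standard Polya urn: repeatedly draw a ball uniformly at
    random and add `reinforce` extra balls of the drawn color.
    Returns the final fraction of white balls."""
    rng = rng or random.Random()
    for _ in range(steps):
        if rng.random() < white / (white + black):
            white += reinforce
        else:
            black += reinforce
    return white / (white + black)
```

Because the white-ball fraction is a martingale, its expected final value equals the initial fraction, even though individual runs drift toward random limits (Beta-distributed for the standard urn).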