• Publications
Understanding Deep Neural Networks with Rectified Linear Units
The gap theorems hold for smoothly parametrized families of "hard" functions, in contrast to the countable, discrete families known in the literature, and a new lower bound on the number of affine pieces is shown, larger than previous constructions in certain regimes of the network architecture.
Maximal Lattice-Free Convex Sets in Linear Subspaces
A model that arises in integer programming is considered and it is shown that all irredundant inequalities are obtained from maximal lattice-free convex sets in an affine subspace and that these sets are polyhedra.
Experiments with Two-Row Cuts from Degenerate Tableaux
The issue of reliability versus aggressiveness of the cut generators, which is usually not addressed in the literature, is considered; the conclusion as to whether these cuts are competitive with Gomory mixed-integer cuts is found to be very sensitive to the experimental setup.
Sparse Coding and Autoencoders
It is proved that a layer of ReLU gates can be set up to automatically recover the support of the sparse codes when the data generative model is that of “Sparse Coding”/“Dictionary Learning”.
Distributed localization using noisy distance and angle information
This paper proposes distributed, iterative methods that provably converge to the solutions of the centralized algorithm, and gives simulation results for the distributed algorithms, evaluating the convergence rate, dependence on measurement noise, and robustness to link dynamics.
Minimal Inequalities for an Infinite Relaxation of Integer Programs
It is shown that maximal S-free convex sets are polyhedra when S is the set of integral points in some rational polyhedron of $\mathbb{R}^n$ and the theorem has implications in integer programming.
On the relative strength of split, triangle and quadrilateral cuts
It is shown that, in a well-defined sense, triangle inequalities provide a good approximation of the integer hull, while split inequalities may be arbitrarily bad.
Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders
This work proves that these adaptive gradient algorithms are guaranteed to reach criticality for smooth non-convex objectives, gives bounds on the running time, and designs experiments to compare the performance of RMSProp and ADAM against the Nesterov Accelerated Gradient method.
Steiner Point Removal in Graph Metrics
Given a family of graphs F, and a graph G ∈ F with weights on the edges, the vertices of G are partitioned into terminals T and Steiner nodes S. The shortest paths (according to edge weights) define a