SignSGD can get the best of both worlds: compressed gradients and an SGD-level convergence rate; its momentum counterpart matches the accuracy and convergence speed of Adam on deep ImageNet models.
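A minimal sketch of the momentum variant (often called Signum): only the sign of the momentum buffer enters the update, so each coordinate can be communicated with one bit. This is an illustrative toy, not the authors' reference implementation; the function and variable names are assumptions.

```python
import numpy as np

def signum_step(w, grad, m, lr=0.05, beta=0.9):
    """One step of momentum signSGD (Signum sketch): accumulate a
    momentum buffer, then step using only its sign (1-bit update)."""
    m = beta * m + (1 - beta) * grad  # momentum accumulation
    w = w - lr * np.sign(m)          # sign-compressed update
    return w, m

# toy problem: minimize f(w) = ||w||^2, whose gradient is 2w
w = np.array([3.0, -2.0])
m = np.zeros_like(w)
for _ in range(200):
    w, m = signum_step(w, 2 * w, m)
```

Because the step size is fixed at `lr` per coordinate, the iterate ends up oscillating in a small band around the optimum rather than converging exactly, which is the usual behavior of sign-based updates with a constant learning rate.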
First, convolutional self-attention is proposed, producing queries and keys with causal convolution so that local context is better incorporated into the attention mechanism; second, the LogSparse Transformer is proposed, improving forecasting accuracy for fine-grained time series with strong long-term dependencies under a constrained memory budget.
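A small NumPy sketch of the convolutional self-attention idea: queries and keys are produced by a causal convolution (kernel size greater than 1) instead of a pointwise projection, so each attention score reflects a local window of the series. Shapes and weight names here are illustrative assumptions, not the paper's code.

```python
import numpy as np

def causal_conv(x, kernel):
    # x: (T, d), kernel: (k, d, d); left-pad so position t only
    # sees inputs at times <= t (causal convolution)
    k = kernel.shape[0]
    xp = np.concatenate([np.zeros((k - 1, x.shape[1])), x], axis=0)
    return np.stack([sum(xp[t + j] @ kernel[j] for j in range(k))
                     for t in range(x.shape[0])])

def conv_self_attention(x, wq, wk, wv):
    """Queries/keys via causal convolution, values via a pointwise
    projection, then standard masked (causal) attention."""
    q, k = causal_conv(x, wq), causal_conv(x, wk)
    v = x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores[np.triu(np.ones(scores.shape, dtype=bool), 1)] = -np.inf
    a = np.exp(scores - scores.max(axis=1, keepdims=True))
    a = a / a.sum(axis=1, keepdims=True)
    return a @ v

T, d = 8, 4
rng = np.random.default_rng(0)
x = rng.normal(size=(T, d))
out = conv_self_attention(x, rng.normal(size=(3, d, d)),
                          rng.normal(size=(3, d, d)),
                          rng.normal(size=(d, d)))
```

The LogSparse attention pattern (attending only to exponentially spaced past positions) is a separate modification of the mask and is not shown here.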
Black Box Shift Estimation (BBSE) is proposed to estimate the test-time label distribution p(y), and it is proved that BBSE works even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible.
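The core of BBSE fits in a few lines: under label shift, the target distribution of black-box predictions satisfies $C w = \mu$, where $C$ is the joint confusion matrix estimated on held-out source data and $w$ is the vector of label-shift weights $q(y)/p(y)$. A hedged sketch (function name and demo data are my own):

```python
import numpy as np

def bbse_weights(y_val, yhat_val, yhat_test, n_classes):
    """BBSE sketch: solve C w = mu for w = q(y)/p(y), where
    C[i, j] = p(yhat=i, y=j) on held-out source data and
    mu[i] = q(yhat=i) on target data. Any black-box predictor
    works as long as C is invertible."""
    C = np.zeros((n_classes, n_classes))
    for yt, yp in zip(y_val, yhat_val):
        C[yp, yt] += 1.0 / len(y_val)
    mu = np.bincount(yhat_test, minlength=n_classes) / len(yhat_test)
    return np.linalg.solve(C, mu)

# toy demo: a perfect classifier, balanced source, skewed target
w = bbse_weights(np.array([0, 0, 1, 1]), np.array([0, 0, 1, 1]),
                 np.array([1, 1, 1, 0]), n_classes=2)  # -> [0.5, 1.5]
```

In practice the estimated weights are then used to importance-weight the source loss when retraining or recalibrating the classifier.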
It is shown that, under standard assumptions, drawing one sample from a posterior distribution is differentially private "for free"; that this sample, as a statistical estimator, is often consistent, near optimal, and computationally tractable; and that these observations lead to an "anytime" algorithm for Bayesian learning under privacy constraints.
A tight upper bound is provided on the Rényi Differential Privacy (RDP) parameters of algorithms that subsample the dataset and then apply a randomized mechanism M to the subsample, expressed in terms of the RDP parameters of M and the subsampling probability.
An optimal Gaussian mechanism is developed whose variance is calibrated directly via the Gaussian cumulative distribution function instead of a tail-bound approximation, and which is equipped with a post-processing step based on adaptive estimation techniques, leveraging the fact that the distribution of the perturbation is known.
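To illustrate CDF-based calibration: the exact $(\varepsilon,\delta)$ curve of the Gaussian mechanism can be written with the standard normal CDF $\Phi$, and the smallest feasible noise scale can then be found numerically. The bisection below is a sketch of this idea, not the paper's exact algorithm, and the names are assumptions.

```python
from math import erf, exp, sqrt

def Phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def delta_gauss(sigma, eps, sens):
    """Exact delta achieved by the Gaussian mechanism at a given eps,
    written with the Gaussian CDF rather than a tail bound."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return Phi(a - b) - exp(eps) * Phi(-a - b)

def calibrate_sigma(eps, delta, sens=1.0, lo=1e-4, hi=1e4, iters=200):
    # delta_gauss is decreasing in sigma, so bisect for the smallest
    # sigma whose exact delta is below the target
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if delta_gauss(mid, eps, sens) > delta:
            lo = mid
        else:
            hi = mid
    return hi

sigma = calibrate_sigma(eps=1.0, delta=1e-5)
```

The resulting `sigma` is smaller than the classical tail-bound calibration $\sigma = \sqrt{2\ln(1.25/\delta)}\,\Delta/\varepsilon$ (about 4.84 here), which is exactly the gain from using the CDF directly.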
A family of adaptive estimators on graphs, based on penalizing the $\ell_1$ norm of discrete graph differences, is introduced; it generalizes trend filtering, used for univariate nonparametric regression, to graphs.
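For concreteness, the zeroth-order case of this penalty is the graph fused lasso (notation here is an assumption, with $E$ the edge set of the graph):

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^n}\;
\frac{1}{2}\,\lVert y - \beta \rVert_2^2
\;+\; \lambda \sum_{(i,j) \in E} \lvert \beta_i - \beta_j \rvert
```

Higher-order variants replace the edgewise differences with higher-order discrete graph differences built from the graph incidence matrix, analogously to higher-order univariate trend filtering.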
A new algorithm, termed Low-Rank Sparse Subspace Clustering (LRSSC), is proposed by combining SSC and LRR, and theoretical guarantees of its success are developed, revealing interesting insights into the strengths and weaknesses of both methods.
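A common way to write the combined self-representation program, blending the nuclear-norm penalty of LRR with the $\ell_1$ penalty of SSC (the exact constraints and weighting in the paper may differ from this sketch):

```latex
\min_{C} \;\; \lVert C \rVert_{*} + \lambda \lVert C \rVert_{1}
\quad \text{s.t.} \quad X = XC,\;\; \operatorname{diag}(C) = 0
```

The learned coefficient matrix $C$ is then symmetrized into an affinity matrix and fed to spectral clustering, as in SSC and LRR.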
IEEE Transactions on Pattern Analysis and Machine…
1 April 2014
This work addresses the challenges of representative background-subtraction techniques in a unified framework that makes few specific assumptions about the background, obtains crisply defined foreground regions, and handles large dynamic background motion much better than prior methods.
The SWITCH estimator is proposed, which can use an existing reward model to achieve a better bias-variance tradeoff than IPS and DR; an upper bound on its MSE is proved, and its benefits are demonstrated empirically on a diverse collection of data sets, often outperforming prior work by orders of magnitude.
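The switching idea can be sketched in a few lines: keep the importance-weighted term where the propensity ratio is below a cap $\tau$, and fall back on the reward model elsewhere, trading unbounded variance for bounded bias. This is an illustrative simplification with assumed names, not the paper's exact estimator.

```python
import numpy as np

def switch_estimate(rewards, actions, pi_logging, pi_target, rhat, tau):
    """SWITCH sketch for off-policy evaluation. pi_logging/pi_target
    are (n, A) arrays of action probabilities per context; rhat[i, a]
    is a reward model's prediction. IPS is used where the importance
    weight is <= tau; the reward model covers the rest."""
    n = len(rewards)
    w = pi_target[np.arange(n), actions] / pi_logging[np.arange(n), actions]
    ips_part = np.where(w <= tau, w * rewards, 0.0)
    # model term: expectation of rhat under pi_target, restricted to
    # actions whose weight would exceed the cap
    big = (pi_target / np.clip(pi_logging, 1e-12, None)) > tau
    model_part = (pi_target * rhat * big).sum(axis=1)
    return np.mean(ips_part + model_part)

# toy demo: identical policies, large tau -> reduces to plain IPS
pi = np.full((2, 2), 0.5)
est = switch_estimate(np.array([1.0, 0.0]), np.array([0, 1]),
                      pi, pi, np.zeros((2, 2)), tau=10.0)  # -> 0.5
```

With `tau` very large the estimator recovers IPS; with `tau = 0` it becomes the pure model-based (direct method) estimate, so `tau` interpolates between the two.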