The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
TLDR
It is found that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work.
Concrete Problems in AI Safety
TLDR
A list of five practical research problems related to accident risk is presented, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process.
Certified Defenses against Adversarial Examples
TLDR
This work proposes a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.
Natural Adversarial Examples
TLDR
This work introduces two challenging datasets that reliably cause machine learning model performance to substantially degrade and curates an adversarial out-of-distribution detection dataset called IMAGENET-O, which is the first out-of-distribution detection dataset created for ImageNet models.
Semidefinite relaxations for certifying robustness to adversarial examples
TLDR
A new semidefinite relaxation for certifying robustness that applies to arbitrary ReLU networks is proposed and it is shown that this proposed relaxation is tighter than previous relaxations and produces meaningful robustness guarantees on three different foreign networks whose training objectives are agnostic to the proposed relaxation.
Certified Defenses for Data Poisoning Attacks
TLDR
This work addresses the worst-case loss of a defense in the face of a determined attacker by constructing approximate upper bounds on the loss across a broad family of attacks, for defenders that first perform outlier removal followed by empirical risk minimization.
Sever: A Robust Meta-Algorithm for Stochastic Optimization
TLDR
This work introduces a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers, and finds that in both cases it has substantially greater robustness than several baselines.
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
The following organisations are named on the report: Future of Humanity Institute, University of Oxford, Centre for the Study of Existential Risk, University of Cambridge, Center for a New American
Learning from untrusted data
TLDR
An algorithm for robust learning in a very general stochastic optimization setting is provided that has immediate implications for robustly estimating the mean of distributions with bounded second moments, robustly learning mixtures of such distributions, and robustly finding planted partitions in random graphs.
A Benchmark for Anomaly Segmentation
TLDR
The Combined Anomalous Object Segmentation benchmark is introduced, combining two novel datasets for anomaly segmentation that incorporate both realism and anomaly diversity. The work also improves out-of-distribution detectors on large-scale multi-class datasets and introduces detectors for the previously unexplored setting of multi-label out-of-distribution detection.