• Corpus ID: 239016895

MEMO: Test Time Robustness via Adaptation and Augmentation

@article{Zhang2021MEMOTT,
  title={MEMO: Test Time Robustness via Adaptation and Augmentation},
  author={Marvin Zhang and Sergey Levine and Chelsea Finn},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.09506}
}
While deep neural networks can attain good accuracy on in-distribution test points, many applications require robustness even in the face of unexpected perturbations in the input, changes in the domain, or other sources of distribution shift. We study the problem of test time robustification, i.e., using the test input to improve model robustness. Recent prior works have proposed methods for test time adaptation, however, they each introduce additional assumptions, such as access to multiple… 
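The core idea the abstract describes — adapting at test time by combining augmentation with adaptation — can be illustrated with a toy sketch. This is a hedged illustration only, not the paper's implementation: the linear model, noise-based "augmentations", step size, and finite-difference gradient are all assumptions made for a self-contained example. It averages the softmax outputs over several augmented copies of one unlabeled test input and takes one gradient step that lowers the entropy of that marginal distribution.

```python
# Toy sketch of marginal-entropy minimization at test time.
# Model, augmentations, and optimizer are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def marginal_entropy(W, augmented_inputs):
    # Marginal prediction: mean softmax over all augmented views.
    p_bar = np.mean([softmax(W @ x) for x in augmented_inputs], axis=0)
    return -np.sum(p_bar * np.log(p_bar + 1e-12))

# One unlabeled test point and several noisy "augmentations" of it.
x_test = rng.normal(size=4)
augs = [x_test + 0.1 * rng.normal(size=4) for _ in range(8)]

W = rng.normal(size=(3, 4))          # 3-class linear model
h_before = marginal_entropy(W, augs)

# Finite-difference gradient of the marginal entropy w.r.t. W
# (a real implementation would use autograd, e.g. PyTorch).
eps, lr = 1e-5, 0.1
grad = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy()
        Wp[i, j] += eps
        grad[i, j] = (marginal_entropy(Wp, augs) - h_before) / eps

W_adapted = W - lr * grad            # one adaptation step
h_after = marginal_entropy(W_adapted, augs)
assert h_after < h_before            # marginal entropy went down
```

The single gradient step uses only the unlabeled test input, which is what makes the procedure applicable at deployment time.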
Efficient Test-Time Model Adaptation without Forgetting
TLDR
An active sample selection criterion is proposed to identify reliable and non-redundant samples, on which the model is updated to minimize the entropy loss for test-time adaptation, and a Fisher regularizer is introduced to constrain important model parameters from drastic changes.
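The "reliable sample" selection described above can be sketched with a simple entropy filter. This is a hedged illustration: the threshold value and the toy prediction matrix are assumptions, and the actual criterion in the paper also removes redundant samples and regularizes parameters, which this sketch omits.

```python
# Entropy-based reliable-sample filter (illustrative assumption:
# a fixed threshold; the paper's criterion is more elaborate).
import numpy as np

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12), axis=-1)

# Toy softmax outputs for four test samples (rows sum to 1).
probs = np.array([
    [0.90, 0.05, 0.05],   # confident -> low entropy, kept
    [0.34, 0.33, 0.33],   # near-uniform -> high entropy, skipped
    [0.80, 0.10, 0.10],
    [0.40, 0.35, 0.25],
])
threshold = 0.8           # assumed cutoff, tuned per dataset in practice
reliable = entropy(probs) < threshold
print(reliable.tolist())  # → [True, False, True, False]
```

Only the samples passing the filter would then be used for the entropy-minimization update, which keeps noisy, near-uniform predictions from dragging the model in arbitrary directions.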
CAFA: Class-Aware Feature Alignment for Test-Time Adaptation
TLDR
A simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), is proposed; it encourages a model to learn target representations in a class-discriminative manner and effectively mitigates distribution shifts at test time.
TTAPS: Test-Time Adaption by Aligning Prototypes using Self-Supervision
TLDR
A novel modification of the self-supervised training algorithm SwAV is proposed that adds the ability to adapt to single test samples and shows the success of the method on the common benchmark dataset CIFAR10-C.
Continual Test-Time Domain Adaptation
TLDR
This work proposes a continual test-time adaptation approach (CoTTA) that stochastically restores a small part of the neurons to the source pre-trained weights during each iteration, helping preserve source knowledge in the long term, and demonstrates the effectiveness of the approach on four classification tasks and a segmentation task.
SITA: Single Image Test-time Adaptation
TLDR
A novel approach AugBN is proposed for the SITA setting that requires only forward propagation and is able to achieve significant performance gains compared to directly applying the source model on the target instances, as reflected in extensive experiments and ablation studies.
Re-using Adversarial Mask Discriminators for Test-time Training under Distribution Shifts
TLDR
It is argued that training stable discriminators produces expressive loss functions that can be re-used at inference to detect and correct segmentation mistakes, opening new research avenues for re-using adversarial discriminators at test time.
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
TLDR
The Decision Region Quantification (DRQ) algorithm is proposed to improve the robustness of any differentiable pre-trained model against both real-world and worst-case distribution shifts in the data.
Domain Generalization: A Survey
TLDR
For the first time, a comprehensive literature review in DG is provided to summarize the developments over the past decade, with a thorough review of existing methods and theories.
Test-Time Robust Personalization for Federated Learning
TLDR
This work identifies the pitfalls of existing works under test-time distribution shifts and proposes a novel test-time robust personalization method, Federated Test-time Head Ensemble plus tuning (FedTHE+), demonstrating its advantage over strong competitors.
Quantifying and Using System Uncertainty in UAV Navigation
TLDR
This paper provides a method to capture the overall system uncertainty in a UAV navigation task and leverages the uncertainty in the system's output to improve control decisions that positively impact the UAV's performance on its task.

References

SHOWING 1-10 OF 54 REFERENCES
Adaptive Risk Minimization: Learning to Adapt to Domain Shift
TLDR
This work considers the problem setting of domain generalization, and introduces the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization
TLDR
A new algorithm for domain generalization (DG), the test-time template adjuster (T3A), is proposed to robustify a model against unknown distribution shift; it stably improves performance on unseen domains across choices of backbone networks and outperforms existing domain generalization methods.
Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift
TLDR
It is shown that prediction-time batch normalization provides benefits complementary to existing state-of-the-art approaches for improving robustness, and that combining the two further improves performance; however, it has mixed results when used alongside pre-training and does not perform as well under more natural types of dataset shift.
Revisiting Batch Normalization For Practical Domain Adaptation
TLDR
This paper proposes a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN, and demonstrates that the method is complementary to other existing methods and may further improve model performance.
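The batch-normalization idea shared by the two entries above — swap the stored source statistics for statistics computed on the target data — is small enough to sketch directly. This is a hedged toy example: the feature dimensions and the synthetic covariate shift are assumptions, and real AdaBN operates per BN layer inside a network.

```python
# Toy sketch of AdaBN-style statistic replacement under covariate shift.
import numpy as np

rng = np.random.default_rng(1)

# Source features with stored (running) BN statistics.
source = rng.normal(loc=0.0, scale=1.0, size=(256, 8))
mu_src, var_src = source.mean(axis=0), source.var(axis=0)

# Target features under covariate shift (different mean and scale).
target = rng.normal(loc=3.0, scale=2.0, size=(256, 8))

# Normalizing target data with SOURCE stats leaves it badly centered...
bad = (target - mu_src) / np.sqrt(var_src + 1e-5)
# ...while recomputing stats on the target domain re-centers it.
mu_tgt, var_tgt = target.mean(axis=0), target.var(axis=0)
good = (target - mu_tgt) / np.sqrt(var_tgt + 1e-5)

assert abs(bad.mean()) > 1.0        # source stats miss the shift
assert abs(good.mean()) < 1e-6      # target stats re-center the features
```

The appeal of this family of methods is that no gradient updates are needed: only the normalization statistics change at prediction time.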
Tent: Fully Test-Time Adaptation by Entropy Minimization
TLDR
By optimizing the model for confidence, as measured by the entropy of its predictions, Tent reduces generalization error for image classification on corrupted ImageNet and CIFAR-10/100 and reaches a new state-of-the-art error on ImageNet-C.
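Tent's recipe — minimize the mean prediction entropy of a test batch while updating only the normalization layer's scale and shift parameters — can be sketched on a toy model. This is a hedged illustration under stated assumptions: a frozen linear head, a single normalization layer, and a finite-difference gradient standing in for autograd.

```python
# Toy sketch of entropy minimization over normalization affine
# parameters only (the rest of the model stays frozen).
import numpy as np

rng = np.random.default_rng(2)

W = rng.normal(size=(3, 4))                    # frozen classifier head
X = rng.normal(loc=1.5, size=(32, 4))          # unlabeled test batch

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def mean_entropy(gamma, beta):
    # Normalization layer with learnable affine params, then the head.
    Xn = (X - X.mean(0)) / (X.std(0) + 1e-5)
    P = softmax((gamma * Xn + beta) @ W.T)
    return -np.mean(np.sum(P * np.log(P + 1e-12), axis=1))

gamma, beta = np.ones(4), np.zeros(4)
h0 = mean_entropy(gamma, beta)

# One finite-difference gradient step on (gamma, beta) only.
eps, lr = 1e-5, 0.1
g_gamma = np.array([(mean_entropy(gamma + eps * e, beta) - h0) / eps
                    for e in np.eye(4)])
g_beta = np.array([(mean_entropy(gamma, beta + eps * e) - h0) / eps
                   for e in np.eye(4)])
h1 = mean_entropy(gamma - lr * g_gamma, beta - lr * g_beta)
assert h1 < h0                                  # batch entropy decreased
```

Restricting the update to a handful of affine parameters is what makes the adaptation cheap and comparatively stable.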
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
TLDR
It is found that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work.
Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation
TLDR
GPS is introduced, a simple but high-performing method for learning a policy of test-time augmentation and it is demonstrated that augmentation policies learned with GPS achieve superior predictive performance on image classification problems, provide better in-domain uncertainty estimation, and improve the robustness to domain shift.
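The fixed-policy test-time augmentation that GPS improves upon reduces to averaging class probabilities over several augmented views. The sketch below is a hedged toy version: the linear model and the additive-noise "policy" are assumptions standing in for a learned augmentation policy.

```python
# Toy sketch of test-time augmentation by prediction averaging.
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict(x, W):
    return softmax(W @ x)

W = rng.normal(size=(5, 16))
x = rng.normal(size=16)

# Augmentation policy: here just additive noise at a few magnitudes;
# GPS would instead greedily select the transforms to apply.
views = [x] + [x + s * rng.normal(size=16) for s in (0.05, 0.1, 0.2)]
p_tta = np.mean([predict(v, W) for v in views], axis=0)

assert p_tta.shape == (5,)
assert abs(p_tta.sum() - 1.0) < 1e-9   # still a valid distribution
```

Averaging distributions rather than logits keeps the output a proper probability vector, which also tends to improve the uncertainty estimates mentioned in the summary.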
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
TLDR
AugMix significantly improves robustness and uncertainty measures on challenging image classification benchmarks, closing the gap between previous methods and the best possible performance in some cases by more than half.
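AugMix's mixing scheme — several random augmentation chains combined with Dirichlet weights, then blended with the clean image via a Beta-sampled coefficient — can be sketched with stand-in operations. This is a hedged illustration: the toy ops below replace the photometric ops the paper actually uses.

```python
# Toy sketch of AugMix-style chain mixing on a small grayscale image.
import numpy as np

rng = np.random.default_rng(4)

# Stand-in augmentation ops on an HxW image in [0, 1].
ops = [
    lambda im: np.roll(im, 1, axis=0),   # translate
    lambda im: im[::-1, :],              # vertical flip
    lambda im: np.clip(im * 1.2, 0, 1),  # brightness
]

def augmix(image, width=3, depth=2, alpha=1.0):
    ws = rng.dirichlet([alpha] * width)      # per-chain mixing weights
    m = rng.beta(alpha, alpha)               # clean/augmented blend
    mixed = np.zeros_like(image)
    for k in range(width):
        chain = image.copy()
        for _ in range(depth):               # random chain of ops
            chain = ops[rng.integers(len(ops))](chain)
        mixed += ws[k] * chain
    return m * image + (1 - m) * mixed

img = rng.random((8, 8))
out = augmix(img)
assert out.shape == img.shape
assert out.min() >= 0.0 and out.max() <= 1.0 + 1e-9
```

Mixing whole chains convexly, instead of applying one heavy transform, is what keeps the augmented image close to the data manifold while still being diverse.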
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
TLDR
This work takes inspiration from transductive learning and notes that, after receiving an input but before making a prediction, a model can be fine-tuned on any unsupervised objective; it formulates a nested optimization (similar to those in meta-learning) and trains models to perform well on the task loss after adapting to the tailoring loss.
TTT++: When Does Self-Supervised Test-Time Training Fail or Thrive?
TLDR
A test-time feature alignment strategy is introduced that utilizes offline feature summarization and online moment matching to regularize adaptation without revisiting training data; the results indicate that storing and exploiting extra information, in addition to model parameters, can be a promising direction towards robust test-time adaptation.
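The offline-summary / online moment-matching idea can be sketched in a few lines: store the source feature moments once, then transform incoming test features so their first two moments match the stored summary, without touching the training data again. Feature dimensions and the synthetic shift below are toy assumptions.

```python
# Toy sketch of moment matching against an offline source summary.
import numpy as np

rng = np.random.default_rng(5)

# Offline: summarize source features by per-dimension mean and std.
source_feats = rng.normal(loc=0.0, scale=1.0, size=(1000, 6))
mu_s, sd_s = source_feats.mean(0), source_feats.std(0)

# Online: shifted test features arrive; align their moments to source.
test_feats = rng.normal(loc=2.0, scale=0.5, size=(200, 6))
mu_t, sd_t = test_feats.mean(0), test_feats.std(0)
aligned = (test_feats - mu_t) / sd_t * sd_s + mu_s

assert np.allclose(aligned.mean(0), mu_s, atol=1e-8)
assert np.allclose(aligned.std(0), sd_s, atol=1e-8)
```

Only the compact summary (two vectors per layer) needs to be stored, which is why this counts as "extra information in addition to model parameters" rather than retained training data.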
…