• Corpus ID: 244130334

Covariate Shift in High-Dimensional Random Feature Regression

Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington
A significant obstacle in the development of robust machine learning models is covariate shift, a form of distribution shift that occurs when the input distributions of the training and test sets differ while the conditional label distributions remain the same. Despite the prevalence of covariate shift in real-world applications, a theoretical understanding in the context of modern machine learning has remained lacking. In this work, we examine the exact high-dimensional asymptotics of random… 

Overparameterization Improves Robustness to Covariate Shift in High Dimensions

This work examines the exact high-dimensional asymptotics of random feature regression under covariate shift and presents a precise characterization of the limiting test error, bias, and variance in this setting, providing one of the first theoretical explanations for this ubiquitous empirical phenomenon.

A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions

This work analyzes the performance of random feature regression with features F = f(WX + B) for a random weight matrix W and bias vector B, obtaining exact formulae for the asymptotic training and test errors for data generated by a linear teacher model.
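The setup in this summary can be illustrated with a minimal NumPy sketch, which is not the paper's analysis or code. It assumes Gaussian inputs, a tanh nonlinearity for f, a noisy linear teacher, and ridge regression on the features; the function name `random_feature_ridge`, the scalings, and all default sizes are hypothetical choices made only for illustration.

```python
import numpy as np

def random_feature_ridge(n_train=200, n_test=500, d=30, n_features=100,
                         ridge=1e-2, seed=0):
    """Fit ridge regression on random features F = f(WX + B).

    All sizes, scalings, and the choice f = tanh are illustrative assumptions.
    Returns (training error, test error) as mean squared errors.
    """
    rng = np.random.default_rng(seed)

    # Linear teacher: y = beta^T x (+ label noise on the training set).
    beta = rng.standard_normal(d) / np.sqrt(d)
    X_train = rng.standard_normal((d, n_train))
    X_test = rng.standard_normal((d, n_test))
    y_train = beta @ X_train + 0.1 * rng.standard_normal(n_train)
    y_test = beta @ X_test

    # Random weights W and bias vector B, fixed after sampling.
    W = rng.standard_normal((n_features, d)) / np.sqrt(d)
    B = rng.standard_normal((n_features, 1))  # broadcast across samples
    f = np.tanh

    F_train = f(W @ X_train + B)
    F_test = f(W @ X_test + B)

    # Ridge regression in feature space: only the readout weights are learned.
    gram = F_train @ F_train.T / n_train + ridge * np.eye(n_features)
    coef = np.linalg.solve(gram, F_train @ y_train / n_train)

    train_err = float(np.mean((coef @ F_train - y_train) ** 2))
    test_err = float(np.mean((coef @ F_test - y_test) ** 2))
    return train_err, test_err

if __name__ == "__main__":
    print(random_feature_ridge())
```

Varying `n_features` relative to `n_train` in a sketch like this gives an empirical counterpart to the asymptotic training and test error curves the summary refers to.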


This work examines the high-dimensional asymptotics of random feature regression in the presence of structured data, allowing for arbitrary input correlations and arbitrary alignment between the data and the weights of the target function. It defines a partial order on the space of weight-data alignments and proves that generalization performance improves in response to stronger alignment.

Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions

It is proved that the noise from SGD negatively impacts generalization performance, ruling out the possibility of any type of implicit regularization in this context. The HSGD formalism is also adapted to include streaming SGD, which allows for an exact prediction of the excess risk of multi-pass SGD relative to that of streaming SGD (bootstrap risk).

Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

A peak in the learning curve is observed whenever $m \approx d^r / r!$ for any integer $r$, leading to multiple sample-wise descent and nontrivial behavior at multiple scales.
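To get a feel for the scales involved, the short sketch below simply evaluates d^r / r! for a few integers r. It reads m as the number of samples and d as the input dimension, which is an assumption based on the phrase "sample-wise descent"; the helper name `peak_sample_sizes` is hypothetical.

```python
from math import factorial

def peak_sample_sizes(d, max_r=3):
    """Return {r: d**r / r!} for r = 1..max_r (illustrative helper)."""
    return {r: d**r / factorial(r) for r in range(1, max_r + 1)}

if __name__ == "__main__":
    for r, m in peak_sample_sizes(100).items():
        print(f"r={r}: m is approximately {m:,.0f}")
```

For d = 100 the first two values are 100 and 5,000, so successive peaks are separated by roughly a factor of d / r in sample size.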

Investigating Power laws in Deep Representation Learning

Inspired by recent advances in theoretical machine learning and vision neuroscience, it is observed that, under mild conditions, the proximity of α to 1 is strongly correlated with downstream generalization performance, and that α ≈ 1 is a strong indicator of robustness to label noise during fine-tuning.

Predicting Out-of-Distribution Error with the Projection Norm

This work proposes a metric—Projection Norm—to predict a model’s performance on out-of-distribution (OOD) data without access to ground truth labels and finds that Projection Norm is the only approach that achieves non-trivial detection performance on adversarial examples.

Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology

A novel, clinically-oriented approach to uncertainty quantification (UQ) for whole-slide images, estimating uncertainty using dropout and calculating thresholds on training data to establish cutoffs for low- and high-confidence predictions.