• Corpus ID: 226226438

# Dataset Meta-Learning from Kernel Ridge-Regression

@article{Nguyen2021DatasetMF,
title={Dataset Meta-Learning from Kernel Ridge-Regression},
author={Timothy Nguyen and Zhourong Chen and Jaehoon Lee},
journal={ArXiv},
year={2021},
volume={abs/2011.00050}
}
• Published 30 October 2020
• Computer Science
• ArXiv
One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are significant corruptions of the original training data while maintaining similar model performance. We introduce a meta-learning algorithm called Kernel Inducing Points (KIP) for obtaining such remarkable datasets, inspired by the recent developments in the…
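The abstract's Kernel Inducing Points (KIP) idea — learn a small support set such that kernel ridge regression fit on it predicts the full training set well — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it uses a plain RBF kernel and hand-rolled gradient descent in place of the neural tangent kernels and optimizer used in the paper, and all hyperparameters (`gamma`, `reg`, `lr`) are assumptions.

```python
import jax
import jax.numpy as jnp

def rbf_kernel(a, b, gamma=0.1):
    # Pairwise squared distances -> RBF Gram matrix (illustrative
    # stand-in for the paper's neural tangent kernel).
    d2 = jnp.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-gamma * d2)

def kip_loss(params, x_train, y_train, reg=1e-3):
    # How well does kernel ridge regression, fit on the learned
    # support set, predict the full training set?
    x_s, y_s = params
    k_ss = rbf_kernel(x_s, x_s)
    k_ts = rbf_kernel(x_train, x_s)
    alpha = jnp.linalg.solve(k_ss + reg * jnp.eye(x_s.shape[0]), y_s)
    preds = k_ts @ alpha
    return jnp.mean((preds - y_train) ** 2)

def kip_distill(x_train, y_train, n_support=10, steps=200, lr=0.05, seed=0):
    # Initialize the support set from real examples, then optimize
    # both its inputs and labels by gradient descent on the KRR loss.
    key = jax.random.PRNGKey(seed)
    idx = jax.random.choice(key, x_train.shape[0], (n_support,), replace=False)
    params = (x_train[idx], y_train[idx])
    grad_fn = jax.jit(jax.grad(kip_loss))
    for _ in range(steps):
        gx, gy = grad_fn(params, x_train, y_train)
        params = (params[0] - lr * gx, params[1] - lr * gy)
    return params
```

Because the inner kernel ridge regression has a closed-form solution, the distillation loss is differentiable end-to-end with respect to the support set itself, which is the key structural trick.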
## 32 Citations

### Dataset Condensation with Differentiable Siamese Augmentation

• Computer Science
ICML
• 2021
Inspired by recent training set synthesis methods, Differentiable Siamese Augmentation is proposed, which enables effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations.

### Federated Learning via Decentralized Dataset Distillation in Resource-Constrained Edge Environments

• Computer Science
ArXiv
• 2022
The experimental results show that FedD3 outperforms other federated learning frameworks in terms of required communication volume, while providing the additional benefit of being able to balance the trade-off between accuracy and communication cost, depending on the usage scenario or target dataset.

### Can we achieve robustness from data alone?

• Computer Science
ArXiv
• 2022
This work devises a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way, aiming to effectively remove the non-robust parts of the data.

### Dataset Distillation using Neural Feature Regression

• Computer Science
ArXiv
• 2022
The proposed algorithm is analogous to truncated backpropagation through time with a pool of models to alleviate various types of overfitting in dataset distillation, and outperforms previous methods on CIFAR100, Tiny ImageNet, and ImageNet-1K.

• Computer Science
ICML
• 2022
This work for the first time identifies that dataset condensation (DC), originally designed for improving training efficiency, is also a better solution to replace traditional data generators for private data generation, thus providing privacy for free.

### Dataset Distillation by Matching Training Trajectories

• Computer Science
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
• 2022
This paper proposes a new formulation that optimizes the distilled data to guide networks to a similar state as networks trained on real data across many training steps; it handily outperforms existing methods and also allows distilling higher-resolution visual data.

### Dataset Condensation with Distribution Matching

• Computer Science
ArXiv
• 2021
This work proposes a simple yet effective method that synthesizes condensed images by matching feature distributions of the synthetic and original training images in many sampled embedding spaces, and significantly reduces the synthesis cost while achieving comparable or better performance.

### Dataset Distillation with Infinitely Wide Convolutional Networks

• Computer Science
NeurIPS
• 2021
A novel distributed kernel-based meta-learning framework is applied to achieve state-of-the-art results for dataset distillation using infinitely wide convolutional neural networks, reaching over 64% test accuracy on the CIFAR10 image classification task, a dramatic improvement over the previous best test accuracy of 40%.

### Dataset Distillation by Matching Training Trajectories

• Computer Science
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2022
Dataset distillation is the task of synthesizing a small dataset such that a model trained on the synthetic set will match the test accuracy of a model trained on the full dataset.

### Bidirectional Learning for Offline Infinite-width Model-based Optimization

• Computer Science
ArXiv
• 2022
This work adopts an infinite-width DNN model, and proposes to employ the corresponding neural tangent kernel to yield a closed-form loss for more accurate design updates.

## References

Showing 1-10 of 44 references

### Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

• Computer Science
ICLR
• 2020
Results are reported suggesting that neural tangent kernels (NTKs) perform strongly on low-data tasks; comparing the performance of the NTK with the finite-width net it was derived from shows that NTK-like behavior starts at lower net widths than suggested by theoretical analysis.

### Using Small Proxy Datasets to Accelerate Hyperparameter Search

• Computer Science, Environmental Science
ArXiv
• 2019
This work aims to generate smaller "proxy datasets" where experiments are cheaper to run but results are highly correlated with experimental results on the full dataset. These "easy" proxies are higher quality than training on the full dataset for a reduced number of epochs (but equivalent computational cost) and, unexpectedly, higher quality than proxies constructed from the hardest examples.

### Meta-learning with differentiable closed-form solvers

• Computer Science
ICLR
• 2019
The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data.
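The structural idea in this line of work — an inner learner with a closed-form solution, so the outer model can be meta-trained through it by ordinary backpropagation — can be sketched briefly. This is an illustrative sketch under simplifying assumptions: the "feature extractor" here is a single linear map `w_feat` standing in for the deep network of the paper, and all names and shapes are hypothetical.

```python
import jax
import jax.numpy as jnp

def ridge_fit(phi, y, lam):
    # Closed-form ridge regression: W = (Phi^T Phi + lam*I)^-1 Phi^T Y.
    d = phi.shape[1]
    return jnp.linalg.solve(phi.T @ phi + lam * jnp.eye(d), phi.T @ y)

def episode_loss(w_feat, support_x, support_y, query_x, query_y, lam=0.1):
    # Fit the inner ridge learner on the support set in feature space,
    # then evaluate it on the query set. The whole pipeline, including
    # the linear solve, is differentiable with respect to w_feat.
    w_ridge = ridge_fit(support_x @ w_feat, support_y, lam)
    preds = (query_x @ w_feat) @ w_ridge
    return jnp.mean((preds - query_y) ** 2)

# Meta-gradient of the query loss with respect to the feature
# extractor, flowing through the closed-form inner solution:
meta_grad = jax.grad(episode_loss)
```

Because the inner adaptation step is a linear solve rather than an unrolled optimization loop, the meta-gradient is exact and cheap, which is what makes per-episode adaptation fast.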

### Meta-Learning With Differentiable Convex Optimization

• Computer Science
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
• 2019
The objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories and this work exploits two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem.

### On Exact Computation with an Infinitely Wide Neural Net

• Computer Science
NeurIPS
• 2019
This paper gives the first efficient exact algorithm for computing the extension of NTK to convolutional neural nets, called Convolutional NTK (CNTK), as well as an efficient GPU implementation of this algorithm.

### Neural Kernels Without Tangents

• Computer Science
ICML
• 2020
Using well-established feature space tools such as direct sum, averaging, and moment lifting, an algebra for creating "compositional" kernels from bags of features is presented that corresponds to many of the building blocks of "neural tangent kernels" (NTKs).

### Convolutional Kernel Networks

• Computer Science
NIPS
• 2014
This paper proposes a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel, and bridges a gap between the neural network literature and kernels, which are natural tools to model invariance.

### Flexible Dataset Distillation: Learn Labels Instead of Images

• Computer Science
ArXiv
• 2020
This work introduces a more robust and flexible meta-learning algorithm for distillation, as well as an effective first-order strategy based on convex optimization layers, and shows it to be more effective than the prior image-based approach to dataset distillation.

### Deep Learning with Differential Privacy

• Computer Science
CCS
• 2016
This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.

### Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.