• Corpus ID: 226226438

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourung Chen, Jaehoon Lee
One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are significant corruptions of the original training data while maintaining similar model performance. We introduce a meta-learning algorithm called Kernel Inducing Points (KIP) for obtaining such remarkable datasets, inspired by the recent developments in the… 


Dataset Condensation with Differentiable Siamese Augmentation

Inspired by recent training set synthesis methods, this work proposes Differentiable Siamese Augmentation, which enables effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations.

Federated Learning via Decentralized Dataset Distillation in Resource-Constrained Edge Environments

The experimental results show that FedD3 outperforms other federated learning frameworks in terms of required communication volume, while providing the additional benefit of being able to balance the trade-off between accuracy and communication cost, depending on the usage scenario or target dataset.

Can we achieve robustness from data alone?

This work devises a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way and aims to effectively remove the non-robust parts of the data.

Dataset Distillation using Neural Feature Regression

The proposed algorithm is analogous to truncated backpropagation through time with a pool of models to alleviate various types of overfitting in dataset distillation and outperforms the previous methods on CIFAR100, Tiny ImageNet, and ImageNet-1K.

Privacy for Free: How does Dataset Condensation Help Privacy?

This work identifies, for the first time, that dataset condensation (DC), originally designed to improve training efficiency, is also a better alternative to traditional data generators for private data generation, thus providing privacy for free.

Dataset Distillation by Matching Training Trajectories

This paper proposes a new formulation that optimizes the distilled data to guide networks to a state similar to that of networks trained on real data across many training steps; it handily outperforms existing methods and also makes it possible to distill higher-resolution visual data.

Dataset Condensation with Distribution Matching

This work proposes a simple yet effective method that synthesizes condensed images by matching feature distributions of the synthetic and original training images in many sampled embedding spaces and significantly reduces the synthesis cost while achieving comparable or better performance.

Dataset Distillation with Infinitely Wide Convolutional Networks

A novel distributed kernel-based meta-learning framework is applied to dataset distillation with infinitely wide convolutional neural networks, achieving state-of-the-art results: over 64% test accuracy on the CIFAR10 image classification task, a dramatic improvement over the previous best test accuracy of 40%.

Dataset Distillation by Matching Training Trajectories

Dataset distillation is the task of synthesizing a small dataset such that a model trained on the synthetic set will match the test accuracy of the model trained on the full dataset. In this paper,

Bidirectional Learning for Offline Infinite-width Model-based Optimization

This work adopts an infinite-width DNN model, and proposes to employ the corresponding neural tangent kernel to yield a closed-form loss for more accurate design updates.



Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

Results are reported suggesting that neural tangent kernels perform strongly on low-data tasks. Comparing the performance of the NTK with that of the finite-width net it was derived from shows that NTK behavior starts at lower net widths than theoretical analysis suggests.

Using Small Proxy Datasets to Accelerate Hyperparameter Search

This work aims to generate smaller "proxy datasets" on which experiments are cheaper to run but whose results are highly correlated with experimental results on the full dataset. These "easy" proxies are of higher quality than training on the full dataset for a reduced number of epochs (at equivalent computational cost) and, unexpectedly, of higher quality than proxies constructed from the hardest examples.

Meta-learning with differentiable closed-form solvers

The main idea is to teach a deep network to use standard machine learning tools, such as ridge regression, as part of its own internal model, enabling it to quickly adapt to novel data.

Meta-Learning With Differentiable Convex Optimization

The objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories and this work exploits two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem.

On Exact Computation with an Infinitely Wide Neural Net

The current paper gives the first efficient exact algorithm for computing the extension of NTK to convolutional neural nets, which it calls Convolutional NTK (CNTK), as well as an efficient GPU implementation of this algorithm.

Neural Kernels Without Tangents

Using well-established feature space tools such as direct sum, averaging, and moment lifting, an algebra for creating "compositional" kernels from bags of features is presented that corresponds to many of the building blocks of "neural tangent kernels" (NTK).

Convolutional Kernel Networks

This paper proposes a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel, and bridges a gap between the neural network literature and kernels, which are natural tools to model invariance.

Flexible Dataset Distillation: Learn Labels Instead of Images

This work introduces a more robust and flexible meta-learning algorithm for distillation, as well as an effective first-order strategy based on convex optimization layers, and shows it to be more effective than the prior image-based approach to dataset distillation.

Deep Learning with Differential Privacy

This work develops new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy, and demonstrates that deep neural networks can be trained with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.