Corpus ID: 239998155

Provable Lifelong Learning of Representations

@article{Cao2021ProvableLL,
  title={Provable Lifelong Learning of Representations},
  author={Xinyuan Cao and Weiyang Liu and Santosh S. Vempala},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.14098}
}
In lifelong learning, the tasks (or classes) to be learned arrive sequentially over time in arbitrary order. During training, knowledge from previous tasks can be captured and transferred to subsequent ones to improve sample efficiency. We consider the setting where all target tasks can be represented in the span of a small number of unknown linear or nonlinear features of the input data. We propose a provable lifelong learning algorithm that maintains and refines the internal feature… 
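A minimal sketch of the setting described in the abstract, assuming noiseless linear tasks: each arriving task is a linear function of a few shared but unknown features, and the learner keeps an orthonormal basis for the subspace spanned by the tasks solved so far, growing it only when a new task is not explained by the current span. Variable names, the tolerance, and the update rule are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

# Illustrative sketch (not the paper's exact algorithm): tasks arrive one at a
# time, each given by an unknown linear function of k shared features. We keep
# an orthonormal basis B for the learned feature subspace and grow it only when
# a new task is not well explained by the current span.

rng = np.random.default_rng(0)
d, k, n_tasks, n_samples = 50, 5, 20, 200

# Hidden ground-truth feature matrix shared by all tasks (unknown to the learner).
U_true = np.linalg.qr(rng.standard_normal((d, k)))[0]

B = np.zeros((d, 0))             # learned representation, starts empty
residual_tol = 1e-6

for t in range(n_tasks):
    w_t = U_true @ rng.standard_normal(k)      # task t lies in span(U_true)
    X = rng.standard_normal((n_samples, d))
    y = X @ w_t                                # noiseless labels for simplicity

    if B.shape[1] > 0:
        # First try to solve the task inside the current learned subspace.
        coef, *_ = np.linalg.lstsq(X @ B, y, rcond=None)
        w_hat = B @ coef
    else:
        w_hat = np.zeros(d)

    if np.linalg.norm(X @ w_hat - y) > residual_tol * np.linalg.norm(y):
        # Task not captured yet: solve it in full dimension and fold the new
        # direction into the representation (re-orthonormalize via QR).
        w_full, *_ = np.linalg.lstsq(X, y, rcond=None)
        B = np.linalg.qr(np.column_stack([B, w_full]))[0]

print("learned representation rank:", B.shape[1])  # about k, far below d
```

After a handful of tasks the learned basis has rank about k, and every later task is solved inside that low-dimensional subspace, which is where the sample-efficiency gain comes from.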


References

Showing 1-10 of 43 references
Efficient Representations for Lifelong Learning and Autoencoding
TLDR: This work poses, and provides efficient algorithms for, several natural theoretical formulations of the problem of learning many different target functions over time that share certain commonalities initially unknown to the learning algorithm.
Conditional Meta-Learning of Linear Representations
TLDR: This work proposes a meta-algorithm capable of leveraging a conditioning function that maps the tasks' side information into a representation tailored to the task at hand, yielding a new estimator that enjoys faster learning rates and requires fewer hyperparameters to tune than current state-of-the-art methods.
Class-incremental learning: survey and performance evaluation
TLDR: This paper provides a complete survey of existing methods for incremental learning, and in particular an extensive experimental evaluation of twelve class-incremental methods, including a comparison on multiple large-scale datasets, an investigation into small and large domain shifts, and a comparison across various network architectures.
How Fine-Tuning Allows for Effective Meta-Learning
TLDR: This work presents a theoretical framework for analyzing a MAML-like algorithm, assuming all available tasks require approximately the same representation, and provides risk bounds on predictors found by fine-tuning via gradient descent, demonstrating that the method provably leverages the shared structure.
Lifelong Learning with Dynamically Expandable Networks
TLDR: The network fine-tuned on all tasks achieves significantly better performance than the batch models, which shows that the method can be used to estimate the optimal network structure even when all tasks are available from the start.
A Structured Prediction Approach for Conditional Meta-Learning
TLDR: This work proposes task-adaptive structured meta-learning (TASML), a principled estimator that weighs meta-training data conditioned on the target task to design tailored meta-learning objectives, and introduces algorithmic improvements to tackle key computational limitations of existing methods.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
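As a rough illustration of the "compatible with any model trained with gradient descent" idea, here is a first-order MAML-style sketch on toy linear-regression tasks; it is a simplified, assumption-laden example rather than the authors' implementation (full MAML also backpropagates through the inner adaptation step).

```python
import numpy as np

# First-order MAML-style sketch on toy linear-regression tasks.
rng = np.random.default_rng(1)
d, inner_lr, outer_lr = 10, 0.1, 0.01
theta = rng.standard_normal(d)                # meta-initialization being learned

def sample_task():
    """Toy task: support and query sets drawn from one random linear function."""
    w = rng.standard_normal(d)
    Xs, Xq = rng.standard_normal((20, d)), rng.standard_normal((20, d))
    return (Xs, Xs @ w), (Xq, Xq @ w)

def grad(theta, X, y):
    """Gradient of the mean squared error 0.5 * ||X @ theta - y||^2 / n."""
    return X.T @ (X @ theta - y) / len(y)

for step in range(1000):
    (Xs, ys), (Xq, yq) = sample_task()
    adapted = theta - inner_lr * grad(theta, Xs, ys)   # inner-loop adaptation
    theta = theta - outer_lr * grad(adapted, Xq, yq)   # first-order outer update
```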
Continual Lifelong Learning with Neural Networks: A Review
TLDR: This review critically summarizes the main challenges linked to lifelong learning for artificial learning systems and compares existing neural network approaches that alleviate, to different extents, catastrophic forgetting.
Orthogonal Gradient Descent for Continual Learning
TLDR: The Orthogonal Gradient Descent (OGD) method is presented, which accomplishes this goal by projecting the gradients from new tasks onto a subspace in which the neural network outputs on previous tasks do not change, while the projected gradient remains in a useful direction for learning the new task.
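A hedged sketch of the projection idea summarized above: directions stored from earlier tasks define a protected subspace, and each new-task gradient has its components along that subspace removed before the update. This is illustrative only; OGD itself stores gradients of the network outputs on previous-task examples, which is abstracted away here.

```python
import numpy as np

def project_orthogonal(grad, basis):
    """Remove from `grad` its components along the stored orthonormal basis."""
    for v in basis:
        grad = grad - np.dot(grad, v) * v
    return grad

def add_to_basis(basis, vec, eps=1e-10):
    """Gram-Schmidt: orthonormalize `vec` against `basis` and store it."""
    vec = project_orthogonal(vec, basis)
    norm = np.linalg.norm(vec)
    if norm > eps:
        basis.append(vec / norm)
    return basis

# Toy usage: gradients from an old task define the protected subspace.
rng = np.random.default_rng(2)
basis = []
for g_old in rng.standard_normal((5, 100)):      # stored old-task directions
    basis = add_to_basis(basis, g_old)

g_new = rng.standard_normal(100)                 # gradient on the new task
g_step = project_orthogonal(g_new, basis)        # step preserving old behaviour
assert all(abs(np.dot(g_step, v)) < 1e-8 for v in basis)
```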
iCaRL: Incremental Classifier and Representation Learning
TLDR: iCaRL can learn many classes incrementally over a long period of time where other strategies quickly fail, which distinguishes it from earlier works that were fundamentally limited to fixed data representations and therefore incompatible with deep learning architectures.