Gradient Descent Quantizes ReLU Network Features
An analysis of feedforward networks with a ReLU activation function, under the assumption of small initialization and learning rate, uncovers a quantization effect: for given input data there are only finitely many "simple" functions that can be obtained, independent of the network size.
Deep Learning Through the Lens of Example Difficulty
A measure of the computational difficulty of making a prediction for a given input, the (effective) prediction depth, is introduced, and surprisingly simple relationships are revealed between the prediction depth of a given input and the model's uncertainty, confidence, accuracy, and speed of learning for that data point.
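The prediction-depth idea can be sketched in a few lines. Given one probe prediction per layer (the paper attaches k-NN probes to intermediate representations; here they are just a list of labels), the depth is the earliest layer from which every later probe already agrees with the network's final prediction. The function name and list-of-labels interface are our simplification, not the paper's code.

```python
def prediction_depth(probe_preds):
    """Effective prediction depth (simplified sketch): the earliest
    layer index from which all subsequent per-layer probe predictions
    agree with the network's final prediction."""
    final = probe_preds[-1]
    depth = len(probe_preds) - 1
    # Walk backwards while the probes still agree with the final answer.
    for i in range(len(probe_preds) - 1, -1, -1):
        if probe_preds[i] == final:
            depth = i
        else:
            break
    return depth

# An "easy" example is decided in the first layer, a "hard" one only late:
easy = prediction_depth(["cat", "cat", "cat", "cat"])  # 0
hard = prediction_depth(["dog", "dog", "cat", "cat"])  # 2
```

Low depth corresponds to examples the paper characterizes as easy: the model is typically more confident and learns them earlier in training.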
What Do Neural Networks Learn When Trained With Random Labels?
It is shown analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels, and how this alignment produces a positive transfer.
The Impact of Reinitialization on Generalization in Convolutional Neural Networks
The accuracy of convolutional neural networks can be improved for small datasets using bottom-up layerwise reinitialization, where the number of reinitialized layers may vary depending on the available compute budget.
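One stage of such a bottom-up schedule can be sketched as follows: keep the bottom `keep` trained layers and re-draw everything above them from the initializer. The function name, the list-of-lists weight representation, and the Gaussian initializer are our illustrative choices; the paper's exact schedule and initializer may differ.

```python
import random

def reinit_above(layers, keep, init_fn=None):
    """Return a copy of `layers` (weight lists, bottom layer first)
    keeping the first `keep` trained layers and re-drawing the rest
    from the initializer. A sketch of one stage of bottom-up
    layerwise reinitialization."""
    if init_fn is None:
        # Illustrative default: small Gaussian weights.
        init_fn = lambda n: [random.gauss(0.0, 0.1) for _ in range(n)]
    return [list(w) if i < keep else init_fn(len(w))
            for i, w in enumerate(layers)]
```

Varying `keep` across stages (and retraining in between) is how the number of reinitialized layers can be traded off against the available compute budget.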
Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem
- Hugo Penedones, Damien Vincent, Hartmut Maennel, S. Gelly, Timothy A. Mann, André Barreto
- Computer Science, ArXiv
- 9 July 2018
The issue of approximation errors in areas of sharp discontinuities of the value function being further propagated by bootstrap updates is investigated: empirical evidence of leakage propagation is presented, and it is shown analytically that, in a simple Markov chain, leakage must occur whenever function approximation errors are present.
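The mechanism can be illustrated with a toy experiment (our own construction, not the paper's exact chain): two episodic chains, one of which straddles a sharp value discontinuity that the function approximator cannot represent because two states are forced to share a single parameter. TD(0)'s bootstrap target then leaks the resulting error into an upstream state whose own parameter could be exact, while Monte Carlo regression on full returns does not.

```python
def td0_vs_mc(gamma=0.9, alpha=0.05, episodes=4000):
    """Two episodic chains: s0 -> s1 -> T (all rewards 0) and
    s2 -> T (reward 1). True values: V(s0)=V(s1)=0, V(s2)=1.
    The approximator shares one parameter theta for s1 and s2
    (they sit across a sharp discontinuity), so theta settles near
    0.5. TD(0)'s bootstrap target gamma*theta leaks that error into
    s0; Monte Carlo keeps V(s0) at its true value 0."""
    v0_td, theta = 0.0, 0.0   # TD params: s0, and shared {s1, s2}
    v0_mc = 0.0               # Monte Carlo estimate for s0
    for _ in range(episodes):
        # Episode 1: s0 -> s1 -> T, rewards 0.
        v0_td += alpha * (0.0 + gamma * theta - v0_td)  # bootstrap leak
        theta += alpha * (0.0 - theta)                   # s1 -> T, r = 0
        v0_mc += alpha * (0.0 - v0_mc)                   # full return is 0
        # Episode 2: s2 -> T, reward 1.
        theta += alpha * (1.0 - theta)
    return v0_td, v0_mc, theta

v0_td, v0_mc, theta = td0_vs_mc()
# theta settles near 0.5; v0_td near gamma * 0.5, while v0_mc stays near 0.
```

Even though s0's value is representable exactly, the TD estimate inherits roughly half of the discontinuity's approximation error through bootstrapping.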
Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
This paper proposes an algorithm that adaptively switches between TD and MC in each state, mitigating the propagation of errors, and suggests that learned confidence intervals are a powerful technique for adapting policy evaluation to use TD or MC returns in a data-driven way.
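A minimal sketch of the switching rule for a single state: if the TD bootstrap target falls inside a confidence interval built from observed Monte Carlo returns, trust the lower-variance TD target; otherwise fall back to the unbiased MC estimate. The function name and the mean ± z·stderr interval are our simplification; the paper learns its per-state confidence intervals rather than computing them this way.

```python
import statistics

def adaptive_target(mc_returns, td_target, z=2.0):
    """Pick the TD target if it is consistent with the Monte Carlo
    returns observed for this state (inside mean +/- z * stderr),
    otherwise use the MC mean. Simplified sketch of per-state
    TD/MC switching via confidence intervals."""
    mean = statistics.mean(mc_returns)
    if len(mc_returns) < 2:
        return mean  # too little data for an interval: use MC
    stderr = statistics.stdev(mc_returns) / len(mc_returns) ** 0.5
    lo, hi = mean - z * stderr, mean + z * stderr
    return td_target if lo <= td_target <= hi else mean

t1 = adaptive_target([1.0, 0.9, 1.1, 1.0], td_target=1.02)  # TD accepted
t2 = adaptive_target([1.0, 0.9, 1.1, 1.0], td_target=2.0)   # rejected -> MC mean
```

States where the TD target drifts outside the interval (e.g. downstream of a leakage-prone discontinuity) automatically revert to MC, which is the error-mitigation the summary describes.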
Uncertainty estimates and out-of-distribution detection with Sine Networks
- Hartmut Maennel
- Computer Science
A surprising remedy is replacing the usual ReLU (or sigmoid) activation functions by sin(x) and adjusting the initialization, which greatly enhances the ability to detect model uncertainty outside of the training distribution.
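The key property can be seen in a tiny one-hidden-layer MLP: with ReLU activations the output grows roughly linearly far from the training data, so the network extrapolates confidently, whereas with sin(x) the output stays bounded everywhere. The network shape, the wider first-layer scale, and all numeric values below are illustrative assumptions, not the paper's initialization recipe.

```python
import math
import random

def mlp_forward(x, weights, activation):
    """One-hidden-layer MLP on a scalar input: hidden pre-activations
    w1[i]*x + b1[i], elementwise activation, linear readout by w2."""
    w1, b1, w2 = weights
    hidden = [activation(w * x + b) for w, b in zip(w1, b1)]
    return sum(h * v for h, v in zip(hidden, w2))

random.seed(0)
n = 16
# Wider-than-usual first-layer scale, since the sine-network recipe
# adjusts the initialization along with the activation (illustrative).
weights = ([random.gauss(0.0, 3.0) for _ in range(n)],
           [random.uniform(-math.pi, math.pi) for _ in range(n)],
           [random.gauss(0.0, 1.0 / n ** 0.5) for _ in range(n)])

relu_out = mlp_forward(10.0, weights, lambda z: max(z, 0.0))
sine_out = mlp_forward(10.0, weights, math.sin)
# |sine_out| can never exceed sum(|w2|), no matter how far x is from
# the training data -- the output collapses toward noise off-distribution,
# which is what makes model uncertainty detectable there.
```

With ReLU there is no such bound: far from the data the output tracks a fixed linear function, giving no signal that the input is out of distribution.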
Perkolationstheorie: Stochastische Modelle poröser Medien (Percolation Theory: Stochastic Models of Porous Media)
- Hartmut Maennel
- 1 September 1994
Abstract: Classical questions, methods, and results of percolation theory are outlined, and a new development based on the idea of conformal invariance is described…
Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes. Accurate MD simulations require computationally demanding quantum-mechanical calculations, being…