Nonlinear Component Analysis as a Kernel Eigenvalue Problem
A new method for performing a nonlinear form of principal component analysis in high-dimensional feature spaces, related to input space by some nonlinear map.
On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation
We introduce a methodology for visualizing the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks.
Kernel Principal Component Analysis
A new method for performing a nonlinear form of Principal Component Analysis is proposed.
Efficient BackProp
An introduction to kernel-based learning algorithms
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples of successful kernel-based learning methods.
Kernel PCA and De-Noising in Feature Spaces
Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms.
Soft Margins for AdaBoost
We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin.
Input space versus feature space in kernel-based methods
This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions, and shows their utility in two applications of kernel methods.
Single-trial analysis and classification of ERP components — A tutorial
We propose to use shrinkage estimators and show that appropriate regularization of linear discriminant analysis by shrinkage yields excellent results for single-trial ERP classification that are far superior to classical LDA classification.
Explaining nonlinear classification decisions with deep Taylor decomposition
We introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network classification decision into contributions of its input elements.