Matthias Hein

Learn More
We investigate the problem of defining Hilbertian metrics resp. positive definite kernels on probability measures, continuing the work in [5]. This type of kernels has shown very good results in text classification and has a wide range of possible applications. In this paper we extend the two-parameter family of Hilbertian metrics of Topsøe such that it now(More)
We present a novel latent embedding model for learning a compatibility function between image and class embeddings, in the context of zero-shot classification. The proposed method augments the state-of-the-art bilinear compatibility model by incorporating latent variables. Instead of learning a single bilinear map, it learns a collection of maps with the(More)
Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with(More)
Given a sample from a probability measure with support on a submanifold in Euclidean space one can construct a neighborhood graph which can be seen as an approximation of the submanifold. The graph Laplacian of such a graph is used in several machine learning methods like semi-supervised learning, dimensionality reduction and clustering. In this paper we(More)
In the machine learning community it is generally believed that graph Laplacians corresponding to a finite sample of data points converge to a continuous Laplace operator if the sample size increases. Even though this assertion serves as a justification for many Laplacianbased algorithms, so far only some aspects of this claim have been rigorously proved.(More)
We consider the problem of denoising a noisily sampled submanifold M in R, where the submanifold M is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians.(More)
Semi-supervised regression based on the graph Laplacian suffers from the fact that the solution is biased towards a constant and the lack of extrapolating power. Based on these observations, we propose to use the second-order Hessian energy for semi-supervised regression which overcomes both these problems. If the data lies on or close to a low-dimensional(More)
An important form of prior information in clustering comes in form of cannot-link and must-link constraints. We present a generalization of the popular spectral clustering technique which integrates such constraints. Motivated by the recently proposed 1-spectral clustering for the unconstrained problem, our method is based on a tight relaxation of the(More)