Max Vladymyrov

Stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms achieve high-quality low-dimensional representations of similarity data, but are notoriously slow to train. We propose a generic formulation of embedding algorithms that includes SNE and other existing algorithms, and study their relation with spectral methods and graph…
Introduction. Dimensionality reduction is an important task in machine learning. It arises when there is a need for exploratory analysis of a dataset, to reveal hidden structure of the data, or as a preprocessing step, by extracting low-dimensional features that are useful for nearest-neighbor retrieval, classification, search or other applications, in an…
Spectral methods for manifold learning and clustering typically construct a graph weighted with affinities (e.g. Gaussian or shortest-path distances) from a dataset and compute eigenvectors of a graph Laplacian. With large datasets, the eigendecomposition is too expensive, and is usually approximated by solving for a smaller graph defined on a subset of the…
Spectral methods for dimensionality reduction and clustering require solving an eigenproblem defined by a sparse affinity matrix. When this matrix is large, one seeks an approximate solution. The standard way to do this is the Nyström method, which first solves a small eigenproblem considering only a subset of landmark points, and then applies an…
Gaussian affinities are commonly used in graph-based methods such as spectral clustering or nonlinear embedding. Hinton and Roweis (2003) introduced a way to set the scale individually for each point so that it has a distribution over neighbors with a desired perplexity, or effective number of neighbors. This gives very good affinities that adapt locally to…
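The perplexity idea described above can be illustrated with a minimal sketch. It is not the method of the paper being summarized: it assumes a plain bisection search on the Gaussian precision, with perplexity defined as the exponential of the entropy in nats. The names `neighbor_distribution`, `calibrate_beta`, and the parameter `beta = 1 / (2 sigma^2)` are hypothetical, introduced only for illustration.

```python
import math

def neighbor_distribution(sq_dists, beta):
    """Return p_j proportional to exp(-beta * d_j) and its perplexity exp(entropy).

    sq_dists holds squared distances from one point to the OTHER points
    (the point itself is excluded). This layout is an assumption of the sketch.
    """
    d_min = min(sq_dists)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return probs, math.exp(entropy)

def calibrate_beta(sq_dists, target_perplexity, tol=1e-6, max_iter=200):
    """Bisection on beta so the neighbor distribution has the target perplexity.

    The target is assumed to lie strictly between 1 and len(sq_dists).
    """
    lo, hi = 0.0, 1.0
    # Perplexity decreases as beta grows, so enlarge hi until it brackets the target.
    while neighbor_distribution(sq_dists, hi)[1] > target_perplexity and hi < 1e12:
        hi *= 2.0
    mid = 0.5 * (lo + hi)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        _, perp = neighbor_distribution(sq_dists, mid)
        if abs(perp - target_perplexity) < tol:
            break
        if perp > target_perplexity:
            lo = mid  # distribution too flat: sharpen it with a larger beta
        else:
            hi = mid  # distribution too peaked: flatten it with a smaller beta
    return mid

# Usage: calibrate one point's scale for an effective 2.5 neighbors.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
beta = calibrate_beta(sq_dists, target_perplexity=2.5)
probs, perplexity = neighbor_distribution(sq_dists, beta)
```

Running the search once per data point gives each point its own scale, which is what lets the affinities adapt to local density.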
Nonlinear embedding algorithms such as stochastic neighbor embedding do dimensionality reduction by optimizing an objective function involving similarities between pairs of input patterns. The result is a low-dimensional projection of each input pattern. A common way to define an out-of-sample mapping is to optimize the objective directly over a parametric…
For problems of image or video segmentation, where clusters have a complex structure, a leading method is spectral clustering. It works by encoding the similarity between pairs of points into an affinity matrix and applying k-means in its low-order eigenspace, where the clustering structure is enhanced. When the number of points is large, an approximation…
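The exact (non-approximate) pipeline described above, an affinity matrix followed by k-means in an eigenspace, can be sketched as follows. This is an illustrative sketch only, not the approximation the paper proposes. It assumes NumPy, a dense Gaussian affinity with a single global `sigma`, the symmetrically normalized affinity with row-normalized eigenvectors (one common variant), and a simple farthest-point initialization for k-means. The function name `spectral_clustering` and all parameters are hypothetical.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Cluster the rows of X into k groups (dense, small-scale sketch)."""
    # Gaussian affinity matrix with zero diagonal.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order, so the last k columns
    # span the leading eigenspace of M.
    _, vecs = np.linalg.eigh(M)
    U = vecs[:, -k:]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # row normalization

    # Farthest-point initialization (an assumption made for determinism).
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)

    # Plain Lloyd iterations of k-means on the embedded points.
    for _ in range(n_iter):
        sq_to_centers = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_to_centers, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

# Usage: two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
labels = spectral_clustering(X, k=2)
```

The dense affinity matrix and full eigendecomposition here cost O(N^2) memory and O(N^3) time, which is why an approximation becomes necessary when the number of points is large.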
Introduction. Dimensionality reduction algorithms have long been used either for exploratory analysis of a high-dimensional dataset, to reveal structure such as clustering, or as a preprocessing step, by extracting low-dimensional features that are useful for classification or other tasks. Here we focus on dimensionality reduction algorithms where a dataset…