Corpus ID: 3641284

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

@article{McInnes2018UMAPUM,
  title={UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction},
  author={Leland McInnes and John Healy},
  journal={ArXiv},
  year={2018},
  volume={abs/1802.03426}
}
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational… 
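The reference implementation is distributed as the umap-learn Python package. As a minimal sketch of typical usage (the data below is synthetic and the parameter values are the library's documented defaults, shown only for illustration):

import numpy as np
import umap

# Toy high-dimensional data standing in for a real dataset.
X = np.random.RandomState(42).normal(size=(1000, 50))

# n_neighbors balances local versus global structure; min_dist controls
# how tightly points may be packed together in the embedding.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2)
embedding = reducer.fit_transform(X)  # array of shape (1000, 2)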
UMAP: Uniform Manifold Approximation and Projection
Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction.
Progressive Uniform Manifold Approximation and Projection
TLDR
This work presents a progressive algorithm for Uniform Manifold Approximation and Projection (UMAP), called Progressive UMAP, which can generate a first approximate projection within a few seconds while still sufficiently capturing the important structures of the high-dimensional dataset.
Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey
TLDR
This is a tutorial and survey paper on UMAP and its variants. The UMAP algorithm is explained, covering the neighborhood probabilities in the input and embedding spaces, optimization of the cost function, the training algorithm, derivation of the gradients, and supervised and semi-supervised embedding with UMAP.
Manifold Learning via Manifold Deflation
TLDR
An embedding method for Riemannian manifolds is derived that iteratively uses single-coordinate estimates to eliminate dimensions from an underlying differential operator, thus "deflating" it; the method is proven consistent when the estimated coordinates converge to the true coordinates.
The mathematics of UMAP
TLDR
A comparison of UMAP embeddings with some other standard dimension reduction algorithms shows that UMAP gives similarly good outputs for visualisation as t-SNE, with a substantially better runtime, and may capture more of the global structure of the data.
Parametric UMAP Embeddings for Representation and Semisupervised Learning
TLDR
This work demonstrates that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of a learned parametric mapping, and explores UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semisupervised learning by capturing structure in unlabeled data.
Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning
TLDR
It is shown that the UMAP loss can be extended to arbitrary deep learning applications, for example by constraining the latent distribution of autoencoders and by improving classifier accuracy for semi-supervised learning through capturing structure in unlabeled data.
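The umap-learn package ships a ParametricUMAP class implementing this parametric variant; it trains a neural network (requiring TensorFlow) on the UMAP objective. A hedged sketch, assuming that class is available:

import numpy as np
from umap.parametric_umap import ParametricUMAP  # requires TensorFlow

X = np.random.RandomState(0).normal(size=(1000, 50))

# The encoder network is trained on the UMAP loss, so the embedding is a
# learned mapping rather than a fixed set of per-point coordinates.
embedder = ParametricUMAP(n_components=2)
embedding = embedder.fit_transform(X)

# Out-of-sample points reuse the learned mapping via a forward pass.
X_new = np.random.RandomState(1).normal(size=(100, 50))
new_embedding = embedder.transform(X_new)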
Uniform Manifold Approximation with Two-phase Optimization
TLDR
Through quantitative experiments, it is found that UMATO outperformed widely used DR techniques in preserving the global structure while producing competitive accuracy in representing the local structure, and is more robust across diverse initialization methods, numbers of epochs, and subsampling techniques.
Evaluating Uniform Manifold Approximation and Projection for Dimension Reduction and Visualization of PolInSAR Features
TLDR
The results show that, for dimension reduction and visualization of PolInSAR features, UMAP exceeds the capability of PCA and Laplacian Eigenmaps (LE) and is competitive with t-SNE.
Efficient Manifold and Subspace Approximations with Spherelets
TLDR
A simple and general alternative to approximating subspaces using a locally linear, and potentially multiscale, dictionary is proposed, which instead uses pieces of spheres, or spherelets, to locally approximate the unknown subspace.

References

Showing 1-10 of 75 references
Mapping a Manifold of Perceptual Observations
TLDR
The isometric feature mapping procedure, or isomap, is able to reliably recover low-dimensional nonlinear structure in realistic perceptual data sets, such as a manifold of face images, where conventional global mapping methods find only local minima.
Deep learning multidimensional projections
TLDR
The approach generates projections with similar characteristics as the learned ones, is computationally two to four orders of magnitude faster than existing projection methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique.
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
TLDR
This work proposes a geometrically motivated algorithm for representing the high-dimensional data that provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering.
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
TLDR
The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering.
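Laplacian Eigenmaps is available in scikit-learn as SpectralEmbedding. A minimal sketch, assuming a k-nearest-neighbor affinity graph (all data and values here are illustrative):

import numpy as np
from sklearn.manifold import SpectralEmbedding

X = np.random.RandomState(0).normal(size=(500, 20))

# Build a k-NN graph, form its graph Laplacian, and embed the data with
# the bottom non-trivial eigenvectors, which is the Laplacian Eigenmaps recipe.
se = SpectralEmbedding(n_components=2, affinity="nearest_neighbors", n_neighbors=10)
Y = se.fit_transform(X)  # array of shape (500, 2)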
A global geometric framework for nonlinear dimensionality reduction.
TLDR
An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
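This is the Isomap algorithm, also available in scikit-learn. A short sketch under the same illustrative assumptions as above:

import numpy as np
from sklearn.manifold import Isomap

X = np.random.RandomState(0).normal(size=(500, 20))

# Isomap approximates geodesic distances by shortest paths on a k-NN
# graph, then applies classical MDS to those distances.
iso = Isomap(n_neighbors=10, n_components=2)
Y = iso.fit_transform(X)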
Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization
TLDR
A rigorous definition for a specific visualization task is given, resulting in quantifiable goodness measures and new visualization methods, and it is shown empirically that the unsupervised version outperforms existing unsupervised dimensionality reduction methods in the visualization task, and the supervised version outperforms existing supervised methods.
Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding
TLDR
Out-of-core randomized principal component analysis (oocPCA) is presented, so that the top principal components of a dataset can be computed without ever fully loading the matrix, hence allowing for t-SNE of large datasets to be computed on resource-limited machines.
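A common pattern in this line of work is to reduce dimensionality with (randomized) PCA before running t-SNE. A hedged sketch using scikit-learn rather than the paper's out-of-core oocPCA implementation:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = np.random.RandomState(0).normal(size=(2000, 200))

# Randomized PCA to 50 dimensions cuts the cost of the subsequent
# t-SNE neighbor computations; the paper's oocPCA additionally avoids
# ever loading the full matrix into memory.
X50 = PCA(n_components=50, svd_solver="randomized").fit_transform(X)
Y = TSNE(n_components=2, perplexity=30).fit_transform(X50)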
Evaluation of UMAP as an alternative to t-SNE for single-cell data
TLDR
A comment on the usefulness of UMAP for high-dimensional cytometry and single-cell RNA sequencing, notably highlighting UMAP's faster runtime and consistency, meaningful organization of cell clusters, and preservation of continuums compared to t-SNE.