# UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

@article{McInnes2018UMAPUM, title={UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction}, author={Leland McInnes and John Healy}, journal={ArXiv}, year={2018}, volume={abs/1802.03426} }

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational…
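The fuzzy topological construction the abstract alludes to reduces, per data point, to a smoothed exponential weighting of its k-nearest-neighbor distances: the nearest neighbor gets weight 1, and a per-point bandwidth is tuned so the weights sum to log2(k). A minimal, stdlib-only Python sketch of that weighting step (the function name, tolerance, and sample distances are illustrative, not from the paper):

```python
import math

def smooth_knn_weights(dists, n_neighbors, n_iter=64):
    """UMAP-style fuzzy membership weights for one point's k-NN distances.
    rho is the distance to the nearest neighbor; sigma is found by binary
    search so the weights sum to log2(k), as in the UMAP paper."""
    rho = min(dists)
    target = math.log2(n_neighbors)
    lo, hi = 0.0, float("inf")
    sigma = 1.0
    for _ in range(n_iter):
        s = sum(math.exp(-max(d - rho, 0.0) / sigma) for d in dists)
        if abs(s - target) < 1e-6:
            break
        if s > target:          # weights too large -> shrink sigma
            hi = sigma
            sigma = (lo + hi) / 2.0
        else:                   # weights too small -> grow sigma
            lo = sigma
            sigma = sigma * 2.0 if hi == float("inf") else (lo + hi) / 2.0
    return [math.exp(-max(d - rho, 0.0) / sigma) for d in dists]

# Toy distances to 4 nearest neighbors; weights sum to log2(4) = 2.
weights = smooth_knn_weights([0.5, 0.8, 1.1, 1.9], n_neighbors=4)
```

The nearest neighbor always receives weight exactly 1 (its distance equals rho), which is how UMAP guarantees local connectivity of the resulting graph.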


## 3,664 Citations

UMAP: Uniform Manifold Approximation and Projection

- Computer Science · J. Open Source Softw.
- 2018

Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction.…

Progressive Uniform Manifold Approximation and Projection

- Computer Science · EuroVis
- 2020

This work presents a progressive algorithm for the Uniform Manifold Approximation and Projection (UMAP), called the Progressive UMAP, which could generate the first approximate projection within a few seconds while also sufficiently capturing the important structures of the high-dimensional dataset.

Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey

- Computer Science · ArXiv
- 2021

This tutorial and survey paper on UMAP and its variants explains the UMAP algorithm, the neighborhood probabilities in the input and embedding spaces, the optimization of the cost function, the training algorithm, the derivation of gradients, and supervised and semi-supervised embedding with UMAP.
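The neighborhood probabilities in the embedding space, and the cost function UMAP optimizes over them, can be sketched concretely. UMAP models low-dimensional similarity as q = 1 / (1 + a·d^(2b)) and minimizes a fuzzy cross-entropy between high- and low-dimensional memberships. The values a ≈ 1.577 and b ≈ 0.895 below are the approximate fitted parameters commonly quoted for umap-learn's default min_dist = 0.1; treat them as illustrative assumptions, not values taken from this survey:

```python
import math

# Approximate curve parameters for UMAP's default min_dist = 0.1
# (illustrative; umap-learn fits a and b from min_dist at runtime).
A, B = 1.577, 0.895

def low_dim_similarity(d, a=A, b=B):
    """UMAP's low-dimensional membership: q = 1 / (1 + a * d^(2b))."""
    return 1.0 / (1.0 + a * d ** (2 * b))

def fuzzy_cross_entropy(p, q, eps=1e-12):
    """Per-edge cross-entropy between high- and low-dimensional
    memberships -- the quantity UMAP's optimizer minimizes."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

q = low_dim_similarity(0.5)       # similarity at embedding distance 0.5
loss = fuzzy_cross_entropy(0.9, q)
```

The second term of the cross-entropy, weighted by (1 − p), is what pushes non-neighbors apart; t-SNE's KL-divergence cost lacks an analogous repulsive term per edge.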

Manifold Learning via Manifold Deflation

- Computer Science · ArXiv
- 2020

An embedding method for Riemannian manifolds is derived that iteratively uses single-coordinate estimates to eliminate dimensions from an underlying differential operator, thus "deflating" it; the method is proven consistent when the coordinate estimates converge to the true coordinates.

The mathematics of UMAP

- Computer Science
- 2019

A comparison of UMAP embeddings with some other standard dimension reduction algorithms shows that UMAP gives similarly good outputs for visualisation as t-SNE, with a substantially better runtime, and may capture more of the global structure of the data.

Parametric UMAP Embeddings for Representation and Semisupervised Learning

- Computer Science · Neural Computation
- 2021

This work demonstrates that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of a learned parametric mapping, and explores UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semisupervised learning by capturing structure in unlabeled data.

Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning

- Computer Science · ArXiv
- 2020

It is shown that UMAP loss can be extended to arbitrary deep learning applications, for example constraining the latent distribution of autoencoders, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data.

Uniform Manifold Approximation with Two-phase Optimization

- Computer Science · ArXiv
- 2022

Through quantitative experiments, it is found that UMATO outperformed widely used DR techniques in preserving the global structure while producing competitive accuracy in representing the local structure, and is preferable in terms of robustness across diverse initialization methods, numbers of epochs, and subsampling techniques.

Evaluating Uniform Manifold Approximation and Projection for Dimension Reduction and Visualization of PolInSAR Features

- Computer Science
- 2021

The results show that UMAP exceeds the capability of PCA and Laplacian Eigenmaps (LE) in these respects and is competitive with t-SNE.

Efficient Manifold and Subspace Approximations with Spherelets

- Computer Science
- 2017

A simple and general alternative to approximating subspaces using a locally linear, and potentially multiscale, dictionary is proposed, which instead uses pieces of spheres, or spherelets, to locally approximate the unknown subspace.

## References

Showing 1–10 of 75 references

Mapping a Manifold of Perceptual Observations

- Computer Science · NIPS
- 1997

The isometric feature mapping procedure, or isomap, is able to reliably recover low-dimensional nonlinear structure in realistic perceptual data sets, such as a manifold of face images, where conventional global mapping methods find only local minima.

Deep learning multidimensional projections

- Computer Science · Inf. Vis.
- 2020

The approach generates projections with similar characteristics as the learned ones, is computationally two to four orders of magnitude faster than existing projection methods, has no complex-to-set user parameters, handles out-of-sample data in a stable manner, and can be used to learn any projection technique.

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

- Computer Science · Neural Computation
- 2003

This work proposes a geometrically motivated algorithm for representing the high-dimensional data that provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering.

Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering

- Computer Science, Mathematics · NIPS
- 2001

The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering.

Shift-invariant similarities circumvent distance concentration in stochastic neighbor embedding and variants

- Computer Science · ICCS
- 2011

Type 1 and 2 mixtures of Kullback-Leibler divergences as cost functions in dimensionality reduction based on similarity preservation

- Computer Science · Neurocomputing
- 2013

A global geometric framework for nonlinear dimensionality reduction.

- Computer Science · Science
- 2000

An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
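The "easily measured local metric information" here is the k-nearest-neighbor graph; Isomap converts it into global geodesic-distance estimates via all-pairs shortest paths before applying classical MDS. A toy, stdlib-only sketch of that shortest-path step using Floyd-Warshall (the graph and function name are illustrative, not from the paper):

```python
import math

def geodesic_distances(graph):
    """All-pairs shortest paths (Floyd-Warshall) over a neighborhood
    graph -- the step Isomap uses to estimate geodesic distances on the
    manifold before embedding them with classical MDS."""
    n = len(graph)
    d = [row[:] for row in graph]          # copy the adjacency matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = math.inf
# Tiny chain graph 0 -- 1 -- 2 with unit edges and no direct 0--2 edge;
# the geodesic distance from 0 to 2 is recovered as 2 via vertex 1.
g = [[0, 1, INF],
     [1, 0, 1],
     [INF, 1, 0]]
geo = geodesic_distances(g)
```

On a real dataset the graph would come from k-nearest neighbors, and the O(n^3) Floyd-Warshall pass is typically replaced by repeated Dijkstra runs for sparse graphs.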

Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization

- Computer Science · J. Mach. Learn. Res.
- 2010

A rigorous definition for a specific visualization task is given, resulting in quantifiable goodness measures and new visualization methods, and it is shown empirically that the unsupervised version outperforms existing unsupervised dimensionality reduction methods in the visualization task, and the supervised version outperforms existing supervised methods.

Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding

- Computer Science · ArXiv
- 2017

Out-of-core randomized principal component analysis (oocPCA) is presented, so that the top principal components of a dataset can be computed without ever fully loading the matrix, hence allowing for t-SNE of large datasets to be computed on resource-limited machines.

Evaluation of UMAP as an alternative to t-SNE for single-cell data

- Computer Science · bioRxiv
- 2018

A comment on the usefulness of UMAP for high-dimensional cytometry and single-cell RNA sequencing, notably highlighting faster runtime and consistency, meaningful organization of cell clusters, and preservation of continuums in UMAP compared to t-SNE.