Corpus ID: 221970327

Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning

@article{Sainburg2020ParametricUL,
  title={Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning},
  author={Tim Sainburg and Leland McInnes and Timothy Q. Gentner},
  journal={ArXiv},
  year={2020},
  volume={abs/2009.12981}
}
We propose Parametric UMAP, a parametric variation of the UMAP (Uniform Manifold Approximation and Projection) algorithm. UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional…
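As a minimal, hedged illustration of what "parametric" means in practice, the sketch below uses the reference implementation that ships with umap-learn (the ParametricUMAP class, which requires TensorFlow); the random data is a stand-in, not from the paper.

import numpy as np
from umap.parametric_umap import ParametricUMAP

X = np.random.rand(1000, 64).astype("float32")   # placeholder for real data

# Step (1) builds the fuzzy simplicial complex; step (2) trains a neural
# network encoder against UMAP's loss instead of optimizing per-point
# coordinates directly.
embedder = ParametricUMAP(n_components=2)
embedding = embedder.fit_transform(X)

# Because the embedding is a learned function, new points are projected
# with a forward pass rather than by re-running the optimization:
new_points = embedder.transform(np.random.rand(10, 64).astype("float32"))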
Citations

Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey
TLDR
A tutorial and survey paper on UMAP and its variants, explaining the UMAP algorithm, the neighborhood probabilities in the input and embedding spaces, optimization of the cost function, the training algorithm, derivation of the gradients, and supervised and semi-supervised embedding with UMAP.
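For orientation, the cost function that the survey derives gradients for is UMAP's fuzzy cross-entropy between input-space memberships v_{ij} and embedding-space similarities w_{ij} (standard notation, not quoted from the survey itself):

C = \sum_{i \neq j} \left[ v_{ij} \log \frac{v_{ij}}{w_{ij}} + (1 - v_{ij}) \log \frac{1 - v_{ij}}{1 - w_{ij}} \right], \qquad w_{ij} = \left(1 + a \lVert y_i - y_j \rVert_2^{2b}\right)^{-1},

where y_i are the embedding coordinates and a, b are constants fit from UMAP's min_dist parameter.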
Using Genetic Programming to Find Functional Mappings for UMAP Embeddings
TLDR
This work proposes utilising UMAP to create functional mappings with genetic programming-based manifold learning, comparing two approaches: one that uses the embedding produced by UMAP as the target for the functional mapping, and one that directly optimises the UMAP cost function by using it as the fitness function.
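The first of the two approaches can be made concrete with a small sketch: fit UMAP, then regress an explicit mapping onto the embedding. The paper evolves the mapping with genetic programming; an MLP regressor stands in here purely to keep the sketch self-contained.

import numpy as np
import umap
from sklearn.neural_network import MLPRegressor

X = np.random.rand(400, 20)                      # placeholder data
Y = umap.UMAP(n_components=2).fit_transform(X)   # embedding as the target

# Fit a functional mapping X -> Y (genetic programming in the paper;
# a neural network here only for illustration).
mapping = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000).fit(X, Y)
Y_new = mapping.predict(np.random.rand(5, 20))   # out-of-sample projection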
UMAP does not reproduce high-dimensional similarities due to negative sampling
TLDR
This work derives UMAP's effective loss function in closed form and finds that it differs from the published one, showing that UMAP does not reproduce its theoretically motivated high-dimensional similarities but instead reproduces similarities that only encode the shared k-nearest-neighbor graph.
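To see where the discrepancy comes from, here is a simplified sketch (our own, not the authors' code) of UMAP's sampling-based optimizer: attraction is applied along sampled graph edges, while repulsion only touches a handful of uniformly drawn points per edge, so the repulsive part of the published loss is never evaluated as written.

import numpy as np

def umap_sgd_epoch(Y, edges, lr=1.0, n_neg=5, a=1.577, b=0.895, eps=1e-3):
    """One simplified epoch of UMAP's negative-sampling optimizer."""
    rng = np.random.default_rng(0)
    n = Y.shape[0]
    for i, j in edges:
        # Attractive update along a sampled positive edge.
        d2 = np.sum((Y[i] - Y[j]) ** 2)
        if d2 > 0.0:
            coeff = (-2.0 * a * b * d2 ** (b - 1.0)) / (1.0 + a * d2 ** b)
            Y[i] += lr * coeff * (Y[i] - Y[j])
        # Repulsive updates from n_neg uniformly sampled "negative" points.
        for k in rng.integers(0, n, n_neg):
            d2 = np.sum((Y[i] - Y[k]) ** 2)
            coeff = (2.0 * b) / ((eps + d2) * (1.0 + a * d2 ** b))
            Y[i] += lr * coeff * (Y[i] - Y[k])
    return Y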
Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology
TLDR
A novel machine learning framework is introduced for the quantitative assessment of the immune content in neuroblastoma (NB) specimens, showing significant agreement between densities estimated by the EUNet model and by trained pathologists, and highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process.
Non-linear Dimensionality Reduction on Extracellular Waveforms Reveals Physiological, Functional, and Laminar Diversity in Premotor Cortex
Cortical circuits involved in decision-making are thought to contain a large number of cell types, each with different physiological, functional, and laminar distribution properties, that coordinate…
Non-linear Dimensionality Reduction on Extracellular Waveforms Reveals Cell Type Diversity in Premotor Cortex
TLDR
This work develops a new method (WaveMAP) that combines non-linear dimensionality reduction with graph clustering to identify putative cell types in an interpretable manner, robustly establishing eight waveform clusters that recapitulate previously identified narrow- and broad-spiking types while revealing previously unknown diversity within these subtypes.
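The two-stage recipe can be sketched briefly: UMAP supplies both a low-dimensional layout and a weighted k-nearest-neighbor graph, and community detection on that graph yields the putative clusters. This mirrors the WaveMAP idea but is our simplification (using Louvain via networkx), not the authors' exact pipeline.

import numpy as np
import umap
import networkx as nx

waveforms = np.random.rand(300, 48)              # stand-in for spike waveforms
reducer = umap.UMAP(n_neighbors=15).fit(waveforms)
layout = reducer.embedding_                      # 2-D coordinates for plotting

# Cluster the fuzzy k-NN graph that UMAP already computed (reducer.graph_).
G = nx.from_scipy_sparse_array(reducer.graph_)
clusters = nx.community.louvain_communities(G, weight="weight", seed=0)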
Bioinformatic Analysis of Temporal and Spatial Proteome Alternations During Infections
TLDR
This review details the steps of a typical temporal and spatial analysis, including data pre-processing, statistical and machine-learning approaches, validation, interpretation, and the extraction of biological information from mass spectrometry data.
Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning
TLDR
This study proposes Message Passing Adaptive Resonance Theory (MPART), a model that infers the class of unlabeled data and selects informative and representative samples through message passing between nodes on the topological graph, and significantly outperforms competitive models in online active learning environments. Experiments use four datasets: mouse retina transcriptomes (Macosko et al., 2015; Poličar et al., 2019), Fashion MNIST (Xiao et al., 2017), EMNIST Letters (Cohen et al., 2017), and CIFAR-10 (Krizhevsky, 2009).

References

Showing 1-10 of 51 references.
DIMAL: Deep Isometric Manifold Learning Using Sparse Geodesic Sampling
TLDR
This paper uses the Siamese configuration to train a neural network to solve the least-squares multidimensional scaling problem, generating maps that approximately preserve geodesic distances, and shows significantly improved local and non-local generalization of the isometric mapping.
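A minimal sketch of that Siamese setup, assuming precomputed geodesic distances d_geo between sampled pairs (architecture and names are illustrative, not DIMAL's exact configuration):

import torch
import torch.nn as nn

# One shared encoder f is applied to both points of a pair ("Siamese"), and
# the embedding distance is fit to the geodesic distance by least squares.
f = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))

def siamese_mds_loss(x_i, x_j, d_geo):
    z_i, z_j = f(x_i), f(x_j)
    d_emb = torch.norm(z_i - z_j, dim=1)
    return torch.mean((d_emb - d_geo) ** 2)      # least-squares MDS stress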
Laplacian Auto-Encoders: An explicit learning of nonlinear data manifold
TLDR
This paper proposes a novel unsupervised manifold learning method termed Laplacian Auto-Encoders (LAEs), which regularizes training of auto-encoders so that the learned encoding function has the locality-preserving property for data points on the manifold.
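In its generic form (our notation, not necessarily the paper's exact objective), the idea adds a graph-Laplacian penalty to the reconstruction loss:

L = \sum_i \lVert x_i - g(f(x_i)) \rVert^2 + \lambda \sum_{i,j} W_{ij} \lVert f(x_i) - f(x_j) \rVert^2,

where f and g are the encoder and decoder and W_{ij} are neighborhood weights on the data graph, so nearby inputs are pushed to nearby codes.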
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
TLDR
An unsupervised loss function is proposed that takes advantage of the stochastic nature of these methods (e.g., randomized data augmentation and dropout) and minimizes the difference between the predictions of multiple passes of a training sample through the network.
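A hedged sketch of that loss: the same sample is passed through the stochastic network twice, and the squared difference of the two predictions is minimized with no label required. `model` and `augment` below are placeholders for any stochastic network and augmentation, not the paper's exact architecture.

import torch

def consistency_loss(model, augment, x):
    # Two passes differ only through random augmentation / dropout noise.
    p1 = model(augment(x))
    p2 = model(augment(x))
    return torch.mean((p1 - p2) ** 2)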
VAE-SNE: a deep generative model for simultaneous dimensionality reduction and clustering
TLDR
It is found that VAE-SNE produces high-quality compressed representations on par with existing nonlinear dimensionality reduction algorithms, and can be used for unsupervised action recognition to detect and classify repeated motifs of stereotyped behavior in high-dimensional time-series data.
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
TLDR
The UMAP algorithm is competitive with t-SNE for visualization quality and arguably preserves more of the global structure, with superior run-time performance.
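For comparison with the parametric variant above, standard UMAP usage looks like this; fit_transform learns one coordinate per training point rather than a reusable mapping function (the data is a placeholder):

import numpy as np
import umap

X = np.random.rand(500, 32)
embedding = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2).fit_transform(X)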
Connectivity-Optimized Representation Learning via Persistent Homology
TLDR
This work controls the connectivity of an autoencoder's latent space via a novel, differentiable loss operating on information from persistent homology, and presents a theoretical analysis of the properties induced by the loss.
Learning a Parametric Embedding by Preserving Local Structure
TLDR
The paper presents a new unsupervised dimensionality reduction technique, called parametric t-SNE, that learns a parametric mapping between the high-dimensional data space and the low-dimensional latent space, and evaluates its performance in experiments on three datasets.
Auto-Encoding Variational Bayes
TLDR
A stochastic variational inference and learning algorithm is introduced that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case.
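The piece of that algorithm most relevant to parametric embeddings is the reparameterization trick, sketched here in PyTorch: sampling z ~ N(mu, sigma^2) is rewritten as a deterministic function of (mu, sigma) plus parameter-free noise, so gradients flow through the sampling step.

import torch

def reparameterize(mu, log_var):
    std = torch.exp(0.5 * log_var)   # sigma = exp(log_var / 2)
    eps = torch.randn_like(std)      # noise carries no parameters
    return mu + eps * std            # differentiable w.r.t. mu and log_var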
Parametric nonlinear dimensionality reduction using kernel t-SNE
TLDR
This work proposes an efficient extension of t-SNE to a parametric framework, kernel t-SNE, which preserves the flexibility of basic t-SNE but enables explicit out-of-sample extensions, and demonstrates that the technique also yields satisfactory results for large data sets.
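As typically defined (our notation), the kernel t-SNE map is a normalized kernel expansion whose coefficients are fit to a reference t-SNE embedding, giving an explicit out-of-sample function:

f(x) = \sum_j \alpha_j \frac{k(x, x_j)}{\sum_l k(x, x_l)},

with Gaussian kernels k centered on training points x_j and vector-valued coefficients \alpha_j obtained by least squares against the t-SNE coordinates.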
Adam: A Method for Stochastic Optimization
TLDR
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
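For reference, the update the paper introduces, with gradient g_t, decay rates \beta_1, \beta_2, step size \alpha, and stability constant \epsilon:

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2,
\hat{m}_t = m_t / (1 - \beta_1^t), \qquad \hat{v}_t = v_t / (1 - \beta_2^t), \qquad \theta_t = \theta_{t-1} - \alpha \, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon).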