Approximated and User Steerable tSNE for Progressive Visual Analytics

Nicola Pezzotti, Boudewijn P. F. Lelieveldt, Laurens van der Maaten, Thomas Höllt, Elmar Eisemann, Anna Vilanova
IEEE Transactions on Visualization and Computer Graphics

Progressive Visual Analytics aims at improving the interactivity in existing analytics techniques by means of visualization as well as interaction with intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of high-dimensional data. tSNE can create meaningful intermediate results… 

Multidimensional Projection for Visual Analytics: Linking Techniques with Distortions, Tasks, and Layout Enrichment

This survey provides a detailed analysis and taxonomies that organize multidimensional projection (MDP) techniques according to their main properties and traits, discusses the impact of these properties on visual perception and other human factors, and outlines future research axes to fill the gaps discovered in this domain.

Dimensionality-Reduction Algorithms for Progressive Visual Analytics

This thesis presents novel algorithmic solutions that enable integration of non-linear dimensionality-reduction techniques in visual analytics systems and presents several applications that are designed to provide unprecedented analytical capabilities in several domains.

Selective Wander Join: Fast Progressive Visualizations for Data Joins

This paper extends Wander Join, a recent online aggregation method optimized for queries that join tables (one of the most computationally expensive operations), and applies user interaction techniques that let the user view and adjust the convergence rate, providing more transparency and control over the online aggregation process.

The Human User in Progressive Visual Analytics

This work characterizes PVA users by their common roles, their main tasks, and their distinct focus of analysis to help PVA visualization designers in devising systems that are tailored for their specific target users and their characteristics.

Pyramid-based Scatterplots Sampling for Progressive and Streaming Data Visualization

This paper presents a pyramid-based scatterplot sampling technique that uses the density values in the pyramid to guide the sampling at each scale, preserving relative data densities and outliers, and that is competitive in quality with state-of-the-art methods.

Steering the Craft: UI Elements and Visualizations for Supporting Progressive Visual Analytics

This paper describes interface design guidelines that help users understand progressively updating results and make early decisions based on progressive estimates, in the context of exploring Twitter data at scale.

Casting Multiple Shadows: High-Dimensional Interactive Data Visualisation with Tours and Embeddings

This work presents visual diagnostics for the pragmatic use of non-linear dimensionality reduction (NLDR) methods by combining them with a technique called the tour, which can preserve global structure and, through user interactions such as linked brushing, reveal where the NLDR view may be misleading.

A Review and Characterization of Progressive Visual Analytics

The review and discussion of PVA presented in this paper address open issues and provide a literature collection on this topic, a conceptual characterization of PVA, and a consolidated set of practical recommendations for implementing and using PVA-based visual analytics solutions.

Visualizing and Exploring Dynamic High-Dimensional Datasets with LION-tSNE

This paper proposes LION-tSNE (Local Interpolation with Outlier coNtrol), a novel approach for incorporating new data into a tSNE representation, based on local interpolation in the vicinity of training data, outlier detection, and a special outlier mapping algorithm.

t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections

t-Distributed Stochastic Neighbor Embedding (t-SNE) for the visualization of multidimensional data has proven to be a popular approach, with successful applications in a wide range of domains.



Progressive Visual Analytics: User-Driven Visual Exploration of In-Progress Analytics

This paper presents an alternative workflow, progressive visual analytics, which enables an analyst to inspect partial results of an algorithm as they become available and interact with the algorithm to prioritize subspaces of interest.

Scalable Optimization of Neighbor Embedding for Visualization

This work demonstrates that the obvious approach of subsampling produces inferior results and proposes a generic approximated optimization technique that reduces the NE optimization cost to O(n log n), and brings "big data" within reach of visualization.

Steerable, Progressive Multidimensional Scaling

This work presents MDSteer, a steerable MDS computation engine and visualization tool that progressively computes an MDS layout and handles datasets of over one million points.

Interactive visualization of streaming data with Kernel Density Estimation

The extension and integration of the statistical concept of Kernel Density Estimation (KDE) in a scatterplot-like visualization for dynamic data at interactive rates is discussed and a GPU-based realization of KDE is presented that leads to interactive frame rates, even for comparably large datasets.

Visual analysis of dimensionality reduction quality for parameterized projections

Cluster Sculptor, an interactive visual clustering system

The Effects of Interactive Latency on Exploratory Visual Analysis

An analysis of verbal data from think-aloud protocols finds that increased latency reduces the rate at which users make observations, draw generalizations, and generate hypotheses, causing users to shift exploration strategy, which in turn affects performance.

Visualizing Data using t-SNE

This paper presents t-SNE, a new technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

Hierarchical Stochastic Neighbor Embedding

This work introduces Hierarchical Stochastic Neighbor Embedding (Hierarchical-SNE), a hierarchical representation of the data that incorporates the well-known mantra of Overview-First, Details-On-Demand in non-linear dimensionality reduction, and explains how it scales to the analysis of big datasets.

Opening the Black Box: Strategies for Increased User Involvement in Existing Algorithm Implementations

This paper concludes that a range of pragmatic options for enabling user involvement in ongoing computations exists on both the visualization and algorithm side, and that these options should be used.