NeuralCubes: Deep Representations for Visual Data Exploration

@article{Wang2018NeuralCubesDR,
  title={NeuralCubes: Deep Representations for Visual Data Exploration},
  author={Zhe Wang and Dylan Cashman and Mingwei Li and Jixian Li and Matthew Berger and Joshua A. Levine and Remco Chang and Carlos Eduardo Scheidegger},
  journal={2021 IEEE International Conference on Big Data (Big Data)},
  year={2018},
  pages={550-561}
}
Visual exploration of large multidimensional datasets has seen tremendous progress in recent years, allowing users to express rich data queries that produce informative visual summaries, all in real time. [...] NeuralCubes learns a function that takes as input a given query, for instance, a geographic region and temporal interval, and outputs the result of the query. The learned function serves as a real-time, low-memory approximator for aggregation queries. NeuralCubes models are small enough to be…
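
The idea of learning a function from a query to its aggregate result can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the authors' architecture or training procedure: the dataset is synthetic, a query is a one-dimensional time interval `(lo, hi)`, the aggregate is the fraction of records in that interval, and the model is a tiny one-hidden-layer network trained with plain stochastic gradient descent. The names `exact_count_fraction`, `predict`, and `train` are hypothetical.

```python
import bisect
import math
import random

random.seed(0)

# Synthetic "dataset": 5,000 record timestamps in (0, 1). Illustrative only.
records = sorted(random.betavariate(2, 5) for _ in range(5000))

def exact_count_fraction(lo, hi):
    """Ground-truth aggregation: fraction of records with lo <= t < hi."""
    left = bisect.bisect_left(records, lo)
    right = bisect.bisect_left(records, hi)
    return (right - left) / len(records)

def random_query():
    """Draw a random interval query (lo, hi) with lo <= hi."""
    a, b = random.random(), random.random()
    return (min(a, b), max(a, b))

# Training set: queries paired with their exact answers.
train_queries = [random_query() for _ in range(400)]
train_targets = [exact_count_fraction(lo, hi) for lo, hi in train_queries]

# Assumed model: 2 inputs -> HIDDEN tanh units -> 1 linear output.
HIDDEN = 16
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b2 = 0.0

def forward(q):
    """Return hidden activations and the scalar prediction for query q."""
    h = [math.tanh(w1[j][0] * q[0] + w1[j][1] * q[1] + b1[j])
         for j in range(HIDDEN)]
    y = sum(w2[j] * h[j] for j in range(HIDDEN)) + b2
    return h, y

def predict(lo, hi):
    """Approximate answer to the interval query, without touching the data."""
    return forward((lo, hi))[1]

def mean_squared_error():
    total = sum((predict(*q) - t) ** 2
                for q, t in zip(train_queries, train_targets))
    return total / len(train_queries)

def train(epochs=150, lr=0.02):
    """Per-sample gradient descent on the squared error 0.5 * (y - t)^2."""
    global b2
    for _ in range(epochs):
        for q, t in zip(train_queries, train_targets):
            h, y = forward(q)
            err = y - t
            for j in range(HIDDEN):
                # Backpropagate through tanh before updating w2[j].
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j][0] -= lr * grad_h * q[0]
                w1[j][1] -= lr * grad_h * q[1]
                b1[j] -= lr * grad_h
            b2 -= lr * err

loss_before = mean_squared_error()
train()
loss_after = mean_squared_error()
```

After training, `predict(0.1, 0.4)` can be compared against `exact_count_fraction(0.1, 0.4)`. The model stores only a few dozen weights regardless of how many records the dataset holds, which is the low-memory property the abstract describes; the accuracy of this toy model is not representative of the paper's results.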

Approximate Query Processing for Data Exploration using Deep Generative Models

This work uses deep generative models, an unsupervised learning based approach, to learn the data distribution faithfully such that aggregate queries could be answered approximately by generating samples from the learned model.

ML-AQP: Query-Driven Approximate Query Processing based on Machine Learning

This work offers a Machine Learning (ML) based solution that provides approximate answers to aggregate queries, works alongside Cloud systems, and keeps response times, monetary and computational costs, and energy footprint low.

MAP-Vis: A Distributed Spatio-Temporal Big Data Visualization Framework Based on a Multi-Dimensional Aggregation Pyramid Model

An open-source distributed visualization framework is developed, built on a generic multi-dimensional aggregation pyramid model that draws on two well-known graphics concepts: the Spatio-temporal Cube and the 2D Tile Pyramid.

IDLat: An Importance-Driven Latent Generation Method for Scientific Data

A novel importance-driven latent representation is proposed to facilitate domain-interest-guided scientific data visualization and analysis; the effectiveness and efficiency of the latent representations generated by the method are evaluated qualitatively and quantitatively with data from multiple scientific visualization applications.

A Visual Analytics System for Profiling Urban Land Use Evolution

This paper presents a meta-modelling system that automates the labor-intensive, and therefore time-consuming and expensive, process of computer programming called “hibernation”.

Approximate Query Processing using Deep Generative Models

This work uses deep generative models, an unsupervised learning based approach, to learn the data distribution faithfully such that aggregate queries could be answered approximately by generating samples from the learned model.

References

Approximate Query Processing for Data Exploration using Deep Generative Models

This work uses deep generative models, an unsupervised learning based approach, to learn the data distribution faithfully such that aggregate queries could be answered approximately by generating samples from the learned model.

Nanocubes for Real-Time Exploration of Spatiotemporal Datasets

This work shows how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries, and calls this data structure a nanocube; it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots.

Hashedcubes: Simple, Low Memory, Real-Time Visual Exploration of Big Data

The algorithms to build and query Hashedcubes are described, along with how it can drive well-known interactive visualizations such as binned scatterplots, linked histograms, and heatmaps; the typical query is answered fast enough to easily sustain interaction.

imMens: Real‐time Visual Querying of Big Data

Methods are presented for interactive visualization of big data, following the principle that perceptual and interactive scalability should be limited by the chosen resolution of the visualized data, not the number of records.

Gaussian Cubes: Real-Time Modeling for Visual Exploration of Large Multidimensional Datasets

Gaussian Cubes is contributed, which significantly improves on state-of-the-art systems by providing interactive modeling capabilities, which include but are not limited to linear least squares and principal components analysis (PCA).

DeepEyes: Progressive Visual Analytics for Designing Deep Neural Networks

This paper presents DeepEyes, a Progressive Visual Analytics system that supports the design of neural networks during training, and presents novel visualizations, supporting the identification of layers that learned a stable set of patterns and, therefore, are of interest for a detailed analysis.

TopKube: A Rank-Aware Data Cube for Real-Time Exploration of Spatiotemporal Data

The computational challenges in building a real-time visual exploratory tool for finding top-ranked objects are described, and recent work on in-memory and rank-aware data cubes is built upon to propose TopKube: a data structure that answers top-k queries up to one order of magnitude faster than the previous state of the art.

Dynamic Prefetching of Data Tiles for Interactive Visualization

In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. ForeCache utilizes a client-server architecture, where the user interacts with a lightweight

Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee

A novel sampling scheme called measure-biased sampling is proposed to address the main challenges of providing rigorous error guarantees and handling arbitrary highly selective predicates without maintaining large samples; two new indexes to augment in-memory samples are also proposed.

BlinkDB: queries with bounded errors and bounded response times on very large data

BlinkDB allows users to trade off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars.
...