• Corpus ID: 53298673

VizRec: A framework for secure data exploration via visual representation

Lorenzo De Stefani, Leonhard F. Spiegelberg, Tim Kraska, Eli Upfal
Visual representations of data (visualizations) are important and widely used tools in data analytics because they give users visual insight into patterns in the observed data in a simple and effective way. However, since visualization tools are applied to sample data, there is a risk of visualizing random fluctuations in the sample rather than a true pattern in the data. This problem is even more significant when visualization is used to identify interesting patterns among many… 
1 Citation

moreThanANOVA: A user-friendly Shiny/R application for exploring and comparing data with interactive visualization

In the case of comparing means of various groups, data exploration and comparison for affecting factors or relative indices would be involved. This process is not only complex requiring extensive



DeepEye: An automatic big data visualization framework

The DEEPEYE system solves the first challenge by training a binary classifier to decide whether a particular visualization is good for a given dataset, and by using a supervised learning-to-rank model to rank the visualizations classified as good.

Investigating the Effect of the Multiple Comparisons Problem in Visual Analysis

It is shown how a confirmatory analysis approach that accounts for all visual comparisons, insights and non-insights, can achieve similar results as one that requires a validation dataset.

MuVE: Efficient Multi-Objective View Recommendation for Visual Data Exploration

The proposed MuVE scheme for Multi-Objective View Recommendation for Visual Data Exploration introduces a hybrid multi-objective utility function, which captures the impact of binning on the utility of visualizations.

Rapid Sampling for Visualizations with Ordering Guarantees

This paper formally shows that its sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate the visualizations with ordering guarantees and work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes.

A Rank-by-Feature Framework for Interactive Exploration of Multidimensional Data

A set of principles and a novel rank-by-feature framework, implemented in the Hierarchical Clustering Explorer, that could enable users to better understand distributions in one (1D) or two dimensions (2D) and discover relationships, clusters, gaps, outliers, and other features.

The eyes have it: a task by data type taxonomy for information visualizations

B. Shneiderman · Computer Science · Proceedings 1996 IEEE Symposium on Visual Languages · 1996
A task by data type taxonomy with seven data types and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extracts) is offered.

How Much Does Your Data Exploration Overfit? Controlling Bias via Information Usage

A general information usage framework is proposed to quantify and provably bound the bias and other error metrics of an arbitrary exploratory analysis, and it is proved that the mutual information based bound is tight in natural settings.

SEEDB: Automatically Generating Query Visualizations

This work demonstrates SeeDB, a system that partially automates this task: given a query, SeeDB explores the space of all possible visualizations, and automatically identifies and recommends to the analyst those visualizations it finds to be most "interesting" or "useful".

Voyager 2: Augmenting Visual Analysis with Partial View Specifications

This work presents Voyager 2, a mixed-initiative system that blends manual and automated chart specification to help analysts engage in both open-ended exploration and targeted question answering and contributes two partial specification interfaces.

Controlling False Discoveries During Interactive Data Exploration

This work proposes a solution to integrate the control of multiple hypothesis testing into interactive data exploration systems and discusses a set of new control procedures that are better suited for this task and integrates them in the system, QUDE.