Teach and try: A simple interaction technique for exploratory data modelling by end users

@article{Sarkar2014TeachAT,
  title={Teach and try: A simple interaction technique for exploratory data modelling by end users},
  author={Advait Sarkar and Alan F. Blackwell and Mateja Jamnik and Martin Spott},
  journal={2014 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)},
  year={2014},
  pages={53--56}
}
The modern economy increasingly relies on exploratory data analysis. Much of this is dependent on data scientists - expert statisticians who process data using statistical tools and programming languages. Our goal is to offer some of this analytical power to end-users who have no statistical training through simple interaction techniques and metaphors. We describe a spreadsheet-based interaction technique that can be used to build and apply sophisticated statistical models such as neural… 

Human-Machine Collaboration for Democratizing Data Science

TLDR
A novel framework and system that aims to democratize data science by allowing users to interact with standard spreadsheet software in order to perform and automate various data analysis tasks, ranging from data wrangling, data selection, clustering, and constraint learning to predictive modeling and auto-completion.

Spreadsheet interfaces for usable machine learning

  • Advait Sarkar
  • Computer Science
    2015 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)
  • 2015
TLDR
A line of research is presented into using the spreadsheet - already familiar to end-users as a paradigm for data manipulation - as a usable interface which lowers the statistical and computing knowledge barriers to building and using these models.

Constructivist Design for Interactive Machine Learning

TLDR
It is argued that the objectives of interactive machine learning can be interpreted as constructivist, and it is shown how constructivist learning environments pose critical questions for the design of interactive machine learning systems.

Interactive visual machine learning in spreadsheets

TLDR
Through a study investigating users' learning barriers while building models using BrainCel, it is found that this approach successfully complements the Teach and Try system to facilitate more complex modelling activities.

A study on machine learning web service

TLDR
A machine learning web service using spreadsheets for non-programmers is presented and compared with another similar service that uses a different interface and a different machine learning algorithm.

Interaction with Uncertainty in Visualisations

TLDR
A novel direct manipulation interface for uncertainty in visualisations is presented and it is shown through a user study that the interface enables people to successfully edit and comprehend uncertainty.

The End-User Programming Challenge of Data Wrangling

TLDR
This work characterises data wrangling as a programming problem, in which aggregate data must be restructured in ways that remain consistent with its semantic origins or ontological referents, and recommends the table as a lowest common denominator representational device.

Confidence, command, complexity: metamodels for structured interaction with machine intelligence

TLDR
A speculative discussion of a potential solution: metamodels of machine cognition, where the notion of "correctness" for these programs is now unknown or ill-defined.

From Natural Language to Programming Language

TLDR
The authors propose a new program synthesis framework, dialog-based programming, which interprets natural language descriptions into computer programs without forcing the input formats and show how natural language alleviates challenges for novice programmers to conduct software development, scripting, and verification.

Visual Analytics as End-User Programming

TLDR
The view of visual analytics is described as a form of end-user programming that reduces expertise barriers to analytical tasks, and two projects that aim to make analytical programming more visual are discussed.

References

Data-centric automated data mining

TLDR
This approach uses a data-centric focus and automated methodologies to make data mining accessible to nonexperts and hides the data mining concepts away from the users thus helping to bridge the conceptual gap usually associated with data mining.

Automating string processing in spreadsheets using input-output examples

TLDR
The design of a string programming/expression language that supports restricted forms of regular expressions, conditionals and loops is described and an algorithm based on several novel concepts for synthesizing a desired program in this language is described from input-output examples.

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing

Champagne Prototyping: A Research Technique for Early Evaluation of Complex End-User Programming Systems

TLDR
A new evaluation technique, based in part on cognitive dimensions and attention investment, called "Champagne prototyping", is presented, which is an early-evaluation technique that is inexpensive to do, yet features the credibility that comes from being based on the real commercial environment of interest, and from working with real users of the environment.

What you see is what you test: a methodology for testing form-based visual programs

TLDR
A testing methodology for form-based visual programs is presented that is validation driven and incremental, and an interface to the methodology is provided that does not require an understanding of testing theory.

The WEKA data mining software: an update

TLDR
This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

R: A language and environment for statistical computing.

Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice