• Corpus ID: 52085371

Distil: A Mixed-Initiative Model Discovery System for Subject Matter Experts (Demo)

Scott Langevin

We present in-progress work on Distil, a mixed-initiative system that enables non-experts with subject matter expertise to generate data-driven models through an interactive, analytic-question-first workflow. Our approach incorporates data discovery, enrichment, analytic model recommendation, and automated visualization to help users understand data and models.

Towards Evaluating Exploratory Model Building Process with AutoML Systems

An evaluation methodology is proposed that guides AutoML builders to divide their AutoML system into multiple sub-system components and helps them reason about each component through visualization of end-users' behavioral patterns and attitudinal data; it also suggests new insights explaining future design opportunities in the AutoML domain.

Benchmark and Survey of Automated Machine Learning Frameworks

This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets to summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline.

Survey on Automated Machine Learning

This survey summarizes recent developments in academia and industry regarding AutoML, introduces a holistic problem formulation and approaches for solving various subproblems of AutoML, and provides an extensive empirical evaluation of the presented approaches on synthetic and real data.

Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks

The open-source toolkit SIMON, an acronym for Semantic Inference for the Modeling of ONtologies, is presented, which implements the character-level convolutional neural network approach to semantic classification in a user-friendly and scalable/parallelizable fashion.

Useable machine learning for Sentinel-2 multispectral satellite imagery

The Distil system is discussed, along with an application of the system in the remote sensing domain and a case study identifying likely locust breeding grounds in Africa from unlabeled 13-channel satellite imagery.



Mixed-Initiative for Big Data: The Intersection of Human + Visual Analytics + Prediction

The conceptual architecture of a mixed-initiative visual analytics system (MIVAS) is presented, along with the five key components that make up MIVASs: data wrangling, alternative discovery and comparison, parametric interaction, history tracking and exploration, and system agency and adaptation.

Predictive Interaction for Data Transformation

Predictive Interaction is presented, a framework for interactive systems that shifts the burden of technical specification from users to algorithms, while preserving human guidance and expressive power.

Mixed-initiative visual analytics using task-driven recommendations

This paper presents candidate design guidelines and introduces the Active Data Environment (ADE) prototype, a spatial workspace supporting the analytic process via task recommendations invoked by inferences about user interactions within the workspace, enabling users to co-reason with the system about their data in a single, spatial workspace.

Predictive Analytics Using a Blackboard-Based Reasoning Agent

  • Jia Yue, A. Raja, W. Ribarsky
  • Computer Science
    2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
  • 2010
RESIN is described, an AI blackboard-based agent that leverages interactive visualizations and mixed-initiative problem solving to enable analysts to explore and pre-process large amounts of data in order to perform predictive analytics.

Proactive wrangling: mixed-initiative end-user programming of data transformation scripts

A model to proactively suggest data transforms which map input data to a relational format expected by analysis tools is presented, and a metric that scores tables according to type homogeneity, sparsity and the presence of delimiters is proposed.

Articulate: A Semi-automated Model for Translating Natural Language Queries into Meaningful Visualizations

Articulate is an attempt at a semi-automated visual analytic model that is guided by a conversational user interface to allow users to verbally describe and then manipulate what they want to see.

DataTone: Managing Ambiguity in Natural Language Interfaces for Data Visualization

This work models ambiguity throughout the process of turning a natural language query into a visualization and uses algorithmic disambiguation coupled with interactive ambiguity widgets to resolve ambiguities by surfacing system decisions at the point where the ambiguity matters.

Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

An abstractive summarization method for tabular data is presented, which employs a knowledge base semantic embedding to generate the summary; experimental results on open data taken from several sources (OpenML, CKAN, and data.world) illustrate the effectiveness of the approach.

The human is the loop: new directions for visual analytics

This work argues for a shift from a 'human in the loop' philosophy for visual analytics to a 'human is the loop' viewpoint, where the focus is on recognizing analysts' work processes and seamlessly fitting analytics into that existing interactive process.

Enterprise Data Analysis and Visualization: An Interview Study

This work characterizes the process of industrial data analysis, documents how organizational features of an enterprise impact it, and describes recurring pain points, outstanding challenges, and barriers to adoption for visual analytic tools.