• Corpus ID: 52085371

Distil: A Mixed-Initiative Model Discovery System for Subject Matter Experts (Demo)

Scott Langevin
We present in-progress work on Distil, a mixed-initiative system that enables non-experts with subject matter expertise to generate data-driven models using an interactive, analytic-question-first workflow. Our approach incorporates data discovery, enrichment, analytic model recommendation, and automated visualization for understanding data and models.

Towards Evaluating Exploratory Model Building Process with AutoML Systems
An evaluation methodology is proposed that guides AutoML builders to divide their AutoML system into multiple sub-system components and helps them reason about each component by visualizing end-users' behavioral patterns and attitudinal data; it also suggests new insights into future design opportunities in the AutoML domain.
Survey on Automated Machine Learning
This survey summarizes recent developments in academia and industry regarding AutoML, introduces a holistic problem formulation and approaches for solving various subproblems of AutoML, and provides an extensive empirical evaluation of the presented approaches on synthetic and real data.
Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks
The open-source toolkit SIMON, an acronym for Semantic Inference for the Modeling of ONtologies, is presented, which implements the character-level convolutional neural network approach to semantic classification in a user-friendly and scalable/parallelizable fashion.
Useable machine learning for Sentinel-2 multispectral satellite imagery
One of the challenges when building Machine Learning (ML) models using satellite imagery is building sufficiently labeled data sets for training. In the past, this problem has been addressed by
Benchmark and Survey of Automated Machine Learning Frameworks
This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets to summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline.


Mixed-Initiative for Big Data: The Intersection of Human + Visual Analytics + Prediction
The conceptual architecture of a mixed-initiative visual analytics system (MIVAS) is presented, along with the five key components that make up MIVASs: data wrangling, alternative discovery and comparison, parametric interaction, history tracking and exploration, and system agency and adaptation.
Predictive Interaction for Data Transformation
Predictive Interaction is presented, a framework for interactive systems that shifts the burden of technical specification from users to algorithms, while preserving human guidance and expressive power.
Mixed-initiative visual analytics using task-driven recommendations
This paper presents candidate design guidelines and introduces the Active Data Environment (ADE) prototype, a spatial workspace supporting the analytic process via task recommendations invoked by inferences about user interactions within the workspace, enabling users to co-reason with the system about their data in a single, spatial workspace.
Predictive Analytics Using a Blackboard-Based Reasoning Agent
Jia Yue, A. Raja, W. Ribarsky. Computer Science. 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2010.
RESIN is described, an AI blackboard-based agent that leverages interactive visualizations and mixed-initiative problem solving to enable analysts to explore and pre-process large amounts of data in order to perform predictive analytics.
Proactive wrangling: mixed-initiative end-user programming of data transformation scripts
A model to proactively suggest data transforms which map input data to a relational format expected by analysis tools is presented, and a metric that scores tables according to type homogeneity, sparsity and the presence of delimiters is proposed.
Articulate: A Semi-automated Model for Translating Natural Language Queries into Meaningful Visualizations
Articulate is an attempt at a semi-automated visual analytic model that is guided by a conversational user interface to allow users to verbally describe and then manipulate what they want to see.
DataTone: Managing Ambiguity in Natural Language Interfaces for Data Visualization
This work models ambiguity throughout the process of turning a natural language query into a visualization and uses algorithmic disambiguation coupled with interactive ambiguity widgets to resolve ambiguities by surfacing system decisions at the point where the ambiguity matters.
Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings
An abstractive summarization method for tabular data is presented that employs a knowledge base semantic embedding to generate the summary; experimental results on open data taken from several sources (OpenML, CKAN and data.world) illustrate the effectiveness of the approach.
A Survey of Visual Analytic Pipelines
This paper reviews the previous work on visual analytics pipelines and individual modules from multiple perspectives: data, visualization, model and knowledge, and compares the commonalities and the differences among them.
The human is the loop: new directions for visual analytics
This work argues for a shift from a 'human in the loop' philosophy for visual analytics to a 'human is the loop' viewpoint, where the focus is on recognizing analysts' work processes and seamlessly fitting analytics into that existing interactive process.