User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

@inproceedings{Daee2018UserMF,
  title={User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction},
  author={Pedram Daee and T. Peltola and Aki Vehtari and Samuel Kaski},
  booktitle={23rd International Conference on Intelligent User Interfaces},
  year={2018}
}
In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human-machine interaction; however, fewer studies have addressed the potential defects the designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, as this can lead to double use of data and overfitting, if…
BEAMES: Interactive Multimodel Steering, Selection, and Inspection for Regression Tasks
TLDR
A technique to allow users to inspect and steer multiple machine learning models from a broader set of learning algorithms and model types is presented and incorporated into a visual analytic prototype, BEAMES, that allows users to perform regression tasks via multimodel steering.
Machine Teaching of Active Sequential Learners
TLDR
The approach gives tools for taking into account the strategic (planning) behaviour of users of interactive intelligent systems, such as recommendation engines, by considering them as boundedly optimal teachers.
Gaggle: Visual Analytics for Model Space Navigation
TLDR
Gaggle simplifies working with multiple models by automatically finding the best model from the high-dimensional model space to support various user tasks, and it is shown how this approach helps users find the best model for a classification and ranking task.
Towards an integrative Theoretical Framework of Interactive Machine Learning Systems
TLDR
The main contribution of this paper is organising and structuring the body of knowledge in IML for the advancement of the field and suggesting three opportunities for future IML research.
Knowledge Graph Completion-based Question Selection for Acquiring Domain Knowledge through Dialogues
TLDR
Two modifications to KGC training are presented: creating pseudo entities from substrings of the names of the entities in the graph, so that entities whose names share substrings are connected, and limiting the range of negative sampling. The results suggest that the model trained with these modifications is capable of avoiding questions with incorrect content.
Probabilistic Formulation of the Take The Best Heuristic
TLDR
This paper investigates a simple decision-making heuristic, Take The Best (TTB), within the framework of cognitively bounded rationality, and formulates TTB as a likelihood-based probabilistic model, where the decision strategy arises by probabilistic inference based on the training data and the model constraints.
Interactive AI with a Theory of Mind
TLDR
This work argues for formulating human-AI interaction as a multi-agent problem, endowing AI with a computational theory of mind to understand and anticipate the user.
Knowledge Extraction via Decentralized Knowledge Graph Aggregation
TLDR
This work proposes that, by gathering insights into what influenced operators' actual parameter choices, tacit process knowledge can be extracted during production in an example-based manner and aggregated into a coherent knowledge graph.
A Conceptual Framework for Personalization of Indoor Comfort Parameters Based on Office Workers' Preferences
TLDR
This paper presents a computational framework for personalization of environmental parameters based on limited feedback from office workers, and proposes that current state-of-the-art machine learning methods can learn the preference model of individuals by employing both the limited feedback and the relevant literature on health-related symptoms.
Ensemble Learning Based Rental Apartment Price Prediction Model by Categorical Features Factoring
TLDR
The results show the accuracy of the apartment rent predictions and indicate the different types of categorical values that affect the machine learning models.

References

SHOWING 1-10 OF 30 REFERENCES
Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets
TLDR
The results of a controlled user study show that the user model significantly improves prior knowledge elicitation and prediction accuracy when predicting the relative citation counts of scientific documents in a specific domain.
Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction
TLDR
This work proposes an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge to improve predictions in sparse linear regression.
Interactive optimization for steering machine classification
TLDR
ManiMatrix is presented, a system that provides controls and visualizations that enable system builders to refine the behavior of classification systems in an intuitive manner; results show that users are able to quickly and effectively modify decision boundaries of classifiers to tailor the behavior of classifiers to problems at hand.
Interactive Prior Elicitation of Feature Similarities for Small Sample Size Prediction
TLDR
Evaluation of the method in experiments with simulated and real users on text data confirms that prior elicitation of feature similarities improves prediction accuracy and that elicitation with an interactive scatterplot display outperforms straightforward elicitation where the users choose feature pairs from a feature list.
Power to the People: The Role of Humans in Interactive Machine Learning
TLDR
It is argued that the design process for interactive machine learning systems should involve users at all stages: explorations that reveal human interaction patterns and inspire novel interaction methods, as well as refinement stages to tune details of the interface and choose among alternatives.
INFUSE: Interactive Feature Selection for Predictive Modeling of High Dimensional Data
TLDR
INFUSE is a novel visual analytics system designed to help analysts understand how predictive features are being ranked across feature selection algorithms, cross-validation folds, and classifiers, and it is demonstrated how the system can lead to important insights in a case study involving clinical researchers predicting patient outcomes from electronic medical records.
Active Learning Literature Survey
TLDR
This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
Statistical Methods for Eliciting Probability Distributions
Elicitation is a key task for subjectivist Bayesians. Although skeptics hold that elicitation cannot (or perhaps should not) be done, in practice it brings statisticians closer to their clients and…
Interactive machine learning for health informatics: when do we need the human-in-the-loop?
Machine learning (ML) is the fastest growing field in computer science, and health informatics is among the greatest challenges. The goal of ML is to develop algorithms which can learn and improve…
Controlling Bias in Adaptive Data Analysis Using Information Theory
TLDR
A general information-theoretic framework to quantify and provably bound the bias and other statistics of an arbitrary adaptive analysis process is proposed, and it is proved that the mutual information based bound is tight in natural models.