Data-Driven Hint Generation in Vast Solution Spaces: a Self-Improving Python Programming Tutor

  • Kelly Rivers, K. Koedinger
  • Published 1 March 2017
  • Computer Science
  • International Journal of Artificial Intelligence in Education
To provide personalized help to students who are working on code-writing problems, we introduce a data-driven tutoring system, ITAP (Intelligent Teaching Assistant for Programming). […] We provide a detailed description of the system’s implementation and perform a technical evaluation on a small set of data to determine the effectiveness of the component algorithms and ITAP’s potential for self-improvement. The results show that ITAP is capable of producing hints for almost any given state after…
Generating Data-driven Hints for Open-ended Programming
A new data-driven algorithm, based on the Hint Factory, is presented for generating hints for open-ended programming assignments; it can provide hints that successfully lead students to solutions from any state, help students achieve assignment objectives, and align with the student’s future solution.
Automated Data-Driven Hint Generation in Intelligent Tutoring Systems for Code-Writing: On the Road of Future Research
The goal of this paper is to review and classify the analysis techniques required to generate data-driven hints in ITSs for programming, and to identify possible future directions in this research field.
A Classification of Data-Driven Hint Generation Techniques for Code-Writing Intelligent Tutoring Systems
  • H. Bui
  • Computer Science, Education
  • 2017
The goal of this paper is to review and classify the analysis techniques required to generate data-driven hints in ITSs for programming, and to propose several future research directions.
Automated Data-Driven Hints for Computer Programming Students
This paper presents an approach for generating hints using previous student data, and shows that it can generate various types of hints for over 90% of students with data from only 10 students, and hence, reduce the cold-start problem.
Automated Data-Driven Hint Generation for Learning Programming
Intelligent tutoring systems can provide personalized feedback to students automatically, but they can take large amounts of time and expert knowledge to build, especially when determining how to give students hints.
The Impact of Data Quantity and Source on the Quality of Data-Driven Hints for Programming
It is found that with student training data, hint quality stops improving after 15–20 training solutions and can decrease with additional data, and that student data outperforms a single expert solution but that a comprehensive set of expert solutions generally performs best.
The Continuous Hint Factory - Providing Hints in Vast and Sparsely Populated Edit Distance Spaces
This contribution provides a mathematical framework for edit-based hint policies and proposes a novel hint policy to provide edit hints in vast and sparsely populated state spaces and demonstrates that the Continuous Hint Factory can predict more accurately what capable students would do compared to existing prediction schemes on two learning tasks.
iSnap: Towards Intelligent Tutoring in Novice Programming Environments
Results from a pilot study of iSnap are shared, indicating that students are generally willing to use hints and that hints can create positive outcomes, and some key challenges encountered in the pilot study are highlighted.
Exploring Design Choices in Data-driven Hints for Python Programming Homework
This paper presents CodeChecker, a system which generates hints automatically using student data, and incorporates them into an existing CS1 online homework environment, used by over 1000 students per semester.
A Survey of Automated Programming Hint Generation: The HINTS Framework
All hint techniques can be understood as a series of simpler components with similar properties, and a simple framework for describing such techniques is presented, the Hint Iteration by Narrow-down and Transformation Steps (HINTS) framework.


Toward Automatic Hint Generation for Logic Proof Tutoring Using Historical Student Data
The feasibility of this approach to automatically generate hints for an intelligent tutor that learns is demonstrated by extracting MDPs from four semesters of student solutions in a logic proof tutor, and the probability that they will be able to generate hints at any point in a given problem is calculated.
Autonomously Generating Hints by Inferring Problem Solving Policies
This paper autonomously generates hints for the 'Hour of Code' (which is, to the best of the authors' knowledge, the largest online course to date) using historical student data, and discovers that this statistic is highly predictive of a student's future success.
Experimental Evaluation of Automatic Hint Generation for a Logic Tutor
This work augmented the Deep Thought logic tutor with a Hint Factory that generates data-driven, context-specific hints for an existing computer-aided instructional tool, and shows that hints help students persist in a deductive logic proof tutor.
Generating Hints for Programming Problems Using Intermediate Output
In the context of the educational programming game, BOTS, it is found that worldstates require less prior data to generate hints in a majority of cases, without sacrificing quality or interpretability.
AutoStyle: Toward Coding Style Feedback At Scale
It is hypothesized that with a large enough (MOOC-sized) corpus of submissions to a given programming problem, a range of stylistic mastery from naive to expert, and many points in between, can be observed, and that this continuum can be exploited to automatically provide hints to learners to improve their code style based on the key stylistic differences between a given learner's submission and one that is stylistically slightly better.
Building Games to Learn from Their Players: Generating Hints in a Serious Game
A novel approach to modeling student states for open-ended problems, like programming in BOTS, is introduced, potentially generalizable to programming tutors for mainstream languages.
Example-based feedback provision using structured solution spaces
The quantitative evidence suggests that the proposed feedback strategies and automatic example assignment are viable in principle, with further user studies in large-scale learning environments left as the subject of future research.
A Response Time Model For Bottom-Out Hints as Worked Examples
It is shown that this model not only predicts learning, but captures behaviors related to self-explanation from bad student use of bottom-out hints by means of logged response times.
A Canonicalizing Model for Building Programming Tutors
This work has constructed a language-independent canonicalized model for programming solutions that allows for much greater overlap across different students than a basic text model, which enables more self-sustaining hint generation methods in programming tutors.
Automated Student Model Improvement
This work presents a technique for automated improvement of student models that leverages the DataShop repository, crowd sourcing, and a version of the Learning Factors Analysis algorithm to discover improved models based on better test-set prediction in cross validation.