• Corpus ID: 245501991

Pedagogical Word Recommendation: A novel task and dataset on personalized vocabulary acquisition for L2 learners

  • Jamin Shin, Juneyoung Park
When learning a second language (L2), vocabulary acquisition, or more simply put, memorizing words, is one of the most important yet tedious components, and its ineffectiveness and inefficiency often demoralize students. A personalized, educational vocabulary recommendation system that traces a learner’s vocabulary knowledge state could therefore have an immense impact on learning, as it could resolve both issues. In this paper, we propose and release data for a novel…



Mining Words in the Minds of Second Language Learners: Learner-Specific Word Difficulty

This work investigated theoretically and practically important models for predicting second language learners’ vocabulary and proposed another model that achieved an accuracy competitive with the current models and defined a measure for how learner-specific a word is.

Building an English Vocabulary Knowledge Dataset of Japanese English-as-a-Second-Language Learners Using Crowdsourcing

  • Yo Ehara
  • Linguistics, Computer Science
  • 2018
A freely available dataset for analyzing the English vocabulary of English-as-a-second language (ESL) learners, which contains the results of the vocabulary size test, a well-studied English vocabulary test, by one hundred test takers hired via crowdsourcing.

Formalizing Word Sampling for Vocabulary Prediction as Graph-based Active Learning

This study proposes a novel framework for a graph-based non-interactive active learning method that can support additional functionality such as incorporating domain specificity and sampling from multiple corpora and shows that its extended methods outperform other methods in terms of vocabulary prediction accuracy when the number of samples is small.

Personalized Text Retrieval for Learners of Chinese as a Foreign Language

A personalized text retrieval algorithm is described that helps language learners select the most suitable reading material in terms of vocabulary complexity and that is effective in identifying simpler texts for low-proficiency learners and more challenging ones for high-proficiency learners.

Personalizing Lexical Simplification

Experimental results show that even a simple personalized CWI model, based on graded vocabulary lists, can help the lexical simplification system avoid some unnecessary simplifications and produce more readable output.

EdNet: A Large-Scale Hierarchical Dataset in Education

EdNet, a large-scale hierarchical dataset of diverse student activities collected by Santa, a multi-platform self-study solution equipped with an artificial intelligence tutoring system, is introduced as the largest public IES dataset released to date.

Context-Aware Attentive Knowledge Tracing

Attentive knowledge tracing is proposed, which couples flexible attention-based neural network models with a series of novel, interpretable model components inspired by cognitive and psychometric models. It exhibits excellent interpretability and thus has potential for automated feedback and personalization in real-world educational settings.

Automatic Discovery of Cognitive Skills to Improve the Prediction of Student Learning

A technique is proposed that uses student performance data to automatically discover the skills needed in a discipline. It incorporates a nonparametric prior over the exercise-skill assignments that is based on the expert-provided skills and a weighted Chinese restaurant process.

A Trainable Spaced Repetition Model for Language Learning

Half-life regression (HLR) combines psycholinguistic theory with modern machine learning techniques, indirectly estimating the “half-life” of a word or concept in a student’s long-term memory, and was able to improve Duolingo daily student engagement by 12% in an operational user study.

Introducing Problem Schema with Hierarchical Exercise Graph for Knowledge Tracing

A hierarchical graph knowledge tracing model called HGKT is proposed to explore the latent complex relations between exercises. It introduces the concept of problem schema to construct a hierarchical exercise graph that can model the exercise learning dependencies.