Helen Yannakoudakis

We demonstrate how supervised discriminative machine learning techniques can be used to automate the assessment of 'English as a Second or Other Language' (ESOL) examination scripts. In particular, we use rank preference learning to explicitly model the grade relationships between scripts. A number of different features are extracted and ablation tests are …
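As a rough illustration of the rank preference idea mentioned in the abstract above, the sketch below converts graded scripts into pairwise preferences and fits a linear scorer to them. It is a minimal sketch under stated assumptions, not the authors' system: a simple perceptron stands in for whatever ranking learner the paper uses, and the two features, the toy grades, and all function names are hypothetical.

```python
from itertools import combinations


def pairwise_differences(features, grades):
    """Turn graded scripts into (difference vector, preference label) pairs."""
    pairs = []
    for i, j in combinations(range(len(grades)), 2):
        if grades[i] == grades[j]:
            continue  # equal grades carry no preference information
        diff = [a - b for a, b in zip(features[i], features[j])]
        label = 1 if grades[i] > grades[j] else -1
        pairs.append((diff, label))
    return pairs


def train_ranking_perceptron(pairs, epochs=100):
    """Learn weights w so that sign(w . diff) matches each preference label."""
    w = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        mistakes = 0
        for diff, label in pairs:
            margin = sum(wi * di for wi, di in zip(w, diff))
            if label * margin <= 0:  # misordered pair: nudge w towards it
                w = [wi + label * di for wi, di in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:
            break  # every pair is ordered correctly
    return w


def score(w, x):
    """Linear score used to order scripts; higher means better."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: [error_rate, vocabulary_range] per script.
features = [[0.30, 0.20], [0.20, 0.40], [0.10, 0.60], [0.05, 0.90]]
grades = [1, 2, 3, 4]

pairs = pairwise_differences(features, grades)
w = train_ranking_perceptron(pairs)
ranked = sorted(range(len(features)), key=lambda i: score(w, features[i]))
```

Here `ranked` lists script indices from lowest to highest predicted quality. Training on pairs rather than on absolute grades means the model only has to get the relative order of scripts right, which is the property the abstract attributes to rank preference learning.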
This paper describes our submission to the CoNLL 2014 shared task on grammatical error correction using a hybrid approach, which includes both a rule-based and an SMT system augmented by a large web-based language model. Furthermore, we demonstrate that correction type estimation can be used to remove unnecessary corrections, improving precision without …
Automated feedback on writing may be a useful complement to teacher comments in the process of learning a foreign language. This paper presents a self-assessment and tutoring system which combines a holistic score with detection and correction of frequent errors and furthermore provides a qualitative assessment of each individual sentence, thus making the …
We demonstrate how data-driven approaches to learner corpora can support Second Language Acquisition research when integrated with visualisation tools. We present a visual user interface supporting the investigation of a set of linguistic features discriminating between pass and fail 'English as a Second or Other Language' exam scripts. The system displays …
Automated Text Scoring (ATS) provides a cost-effective and consistent alternative to human marking. However, in order to achieve good performance, the predictive features of the system need to be manually engineered by human experts. We introduce a model that forms word representations by learning the extent to which specific words contribute to the text's …
Various measures have been used to evaluate the effectiveness of automated text scoring (ATS) systems with respect to a human gold standard. However, there is no systematic study comparing the efficacy of these metrics under different experimental conditions. In this paper we first argue that measures of agreement are more appropriate than measures of …
In this thesis, we investigate automated assessment (AA) systems of free text that automatically analyse and score the quality of writing of learners of English as a second (or other) language. Previous research has employed techniques that measure, in addition to writing competence, the semantic relevance of a text written in response to a given prompt. We …
We demonstrate how data-driven approaches to learner corpora can support Second Language Acquisition research when integrated with visualisation tools. We employ a visual user interface supporting the investigation of a set of automatically determined features discriminating between pass and fail First Certificate in English (FCE) exam scripts. We …
This thesis addresses the task of error detection in the choice of content words, focusing on adjective–noun and verb–object combinations. We show that error detection in content words is an under-explored area in research on learner language since (i) most previous approaches to error detection and correction have focused on other error types, and (ii) the …