Michael Flor

We describe a new representation of the content vocabulary of a text, which we call the word association profile, that captures the proportions of highly associated, mildly associated, unassociated, and dis-associated pairs of words that co-exist in the given text. We illustrate the shape of the distribution and observe variation with genre and target audience. We …
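As a rough illustration of the idea in the abstract above, the sketch below bins the pairwise association scores of a text's distinct words into four bands and reports the proportion of pairs in each band. Everything specific here is an assumption made for illustration: the abstract does not say which association measure or cut-offs are used, so the toy scores in `TOY_SCORES`, the thresholds `high` and `low`, and the function names `bin_label` and `association_profile` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical association scores for word pairs (illustrative values only;
# neither the scores nor the measure behind them come from the paper).
TOY_SCORES = {
    frozenset(("dog", "bark")): 4.0,
    frozenset(("dog", "leash")): 1.5,
    frozenset(("bark", "leash")): 0.1,
    frozenset(("dog", "tax")): -2.0,
    frozenset(("bark", "tax")): -0.2,
    frozenset(("leash", "tax")): 0.0,
}


def bin_label(score, high=3.0, low=0.5):
    """Map a score to one of four bands; the thresholds are assumptions."""
    if score >= high:
        return "high"
    if score >= low:
        return "mild"
    if score > -low:
        return "none"
    return "dis"


def association_profile(words, scores):
    """Return the proportion of scored word pairs falling in each band."""
    vocab = sorted(set(words))  # pairs are formed over distinct word types
    counts = Counter()
    for a, b in combinations(vocab, 2):
        key = frozenset((a, b))
        if key in scores:  # pairs without a known score are skipped
            counts[bin_label(scores[key])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("high", "mild", "none", "dis")}


profile = association_profile(["dog", "bark", "leash", "tax", "dog"], TOY_SCORES)
print(profile)
```

With the toy data, half of the six scored pairs land in the "none" band and one pair lands in each of the other three. A real profile would use a finer histogram and association scores estimated from a large corpus.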
In this paper we present a new spell-checking system that utilizes contextual information for automatic correction of non-word misspellings. The system is evaluated with a large corpus of essays written by native and non-native speakers of English to the writing prompts of high-stakes standardized tests (TOEFL® and GRE®). We also present comparative …
Developments in the educational landscape have spurred greater interest in the problem of automatically scoring short answer questions. A recent shared task on this topic revealed a fundamental divide in the modeling approaches that have been applied to this problem, with the best-performing systems split between those that employ a knowledge engineering …
Many existing approaches for measuring text complexity tend to overestimate the complexity levels of informational texts while simultaneously underestimating the complexity levels of literary texts. We present a two-stage estimation technique that successfully addresses this problem. At Stage 1, each text is classified into one or another of three possible …
In this paper, we address the problem of quantifying the overall extent to which a test-taker's essay deals with its assigned topic (the prompt). We experiment with a number of models for word topicality, and a number of approaches for aggregating word-level indices into text-level ones. All models are evaluated for their ability to predict the holistic …