• Corpus ID: 4896510

Learning to Simplify Sentences Using Wikipedia

William Coster and David Kauchak
In this paper we examine the sentence simplification problem as an English-to-English translation problem, utilizing a corpus of 137K aligned sentence pairs extracted by aligning English Wikipedia and Simple English Wikipedia. This data set contains the full range of transformation operations including rewording, reordering, insertion and deletion. We introduce a new translation model for text simplification that extends a phrase-based machine translation approach to include phrasal deletion… 

