Farasa: A New Fast and Accurate Arabic Word Segmenter


In this paper, we present Farasa ("insight" in Arabic), a fast and accurate Arabic segmenter. Segmentation involves breaking Arabic words into their constituent clitics. Our approach is based on SVM using linear kernels. The features that we use account for: likelihood of stems, prefixes, suffixes, and their combinations; presence in…
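To make the idea of clitic segmentation concrete, the following is a minimal sketch of one plausible setup: enumerate candidate prefix/stem/suffix splits of a word and pick the one with the highest linear score. Everything specific here is an assumption for illustration and not taken from the paper: the toy clitic lists, the Buckwalter-style transliterated input, the three indicator features, the hand-set weights, and the function names `candidate_segmentations`, `score`, and `segment`. In a real system the weights would be learned (for example by a linear-kernel SVM) rather than set by hand.

```python
from itertools import product

# Hypothetical toy clitic inventories (Buckwalter-style transliteration).
# Illustrative assumptions only, not the lists used by Farasa.
PREFIXES = ["", "w", "b", "Al"]
SUFFIXES = ["", "h", "hm", "At"]


def candidate_segmentations(word):
    """Return every (prefix, stem, suffix) split that leaves a non-empty stem."""
    candidates = []
    for prefix, suffix in product(PREFIXES, SUFFIXES):
        # Skip splits where the clitics would consume the whole word.
        if len(prefix) + len(suffix) >= len(word):
            continue
        if word.startswith(prefix) and word.endswith(suffix):
            stem = word[len(prefix):len(word) - len(suffix)]
            candidates.append((prefix, stem, suffix))
    return candidates


def score(candidate, weights, stem_lexicon):
    """Linear score: weighted sum of simple indicator features (assumed)."""
    prefix, stem, suffix = candidate
    features = {
        "stem_in_lexicon": 1.0 if stem in stem_lexicon else 0.0,
        "has_prefix": 1.0 if prefix else 0.0,
        "has_suffix": 1.0 if suffix else 0.0,
    }
    return sum(weights[name] * value for name, value in features.items())


def segment(word, weights, stem_lexicon):
    """Pick the highest-scoring candidate and join its non-empty parts with '+'."""
    best = max(
        candidate_segmentations(word),
        key=lambda cand: score(cand, weights, stem_lexicon),
    )
    return "+".join(part for part in best if part)


# Hand-set weights standing in for learned ones (an assumption).
TOY_WEIGHTS = {"stem_in_lexicon": 2.0, "has_prefix": -0.1, "has_suffix": -0.1}
TOY_LEXICON = {"ktAb"}

print(segment("wktAbhm", TOY_WEIGHTS, TOY_LEXICON))  # prints "w+ktAb+hm"
```

With these toy settings, the word `wktAbhm` has four candidate splits, and the split whose stem appears in the lexicon wins despite the small penalties for attaching clitics.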


3 Figures and Tables

