Mohamed Outahajala

The aim of this paper is to present the first Amazighe POS tagger. Very few linguistic resources have been developed so far for Amazighe, and we believe that the development of a POS tagger tool is the first step needed for automatic text processing. The data used have been manually collected and annotated. We have used state-of-the-art supervised machine(More)
Over the last few years, Moroccan society has seen much debate about the Amazigh language and culture. The creation of a new governmental institution, namely IRCAM, has made it possible for the Amazigh language and culture to reclaim their rightful place in many domains. Taking into consideration the situation of the Amazigh language which needs more(More)
Like most languages that have only recently started being investigated for Natural Language Processing (NLP) tasks, Amazigh lacks annotated corpora and still suffers from a scarcity of linguistic tools and resources. The main aim of this paper is to present a tokenizer tool and a new part-of-speech (POS) tagger based on a new Amazigh(More)
Amazigh is used by tens of millions of people, mainly for oral communication. However, like all languages newly investigated in natural language processing, it is resource-scarce. The main aim of this paper is to present our POS tagger results based on two state-of-the-art sequence labeling techniques, namely Conditional Random Fields and(More)
The main goal of this work is the implementation of a new tool for Amazigh part-of-speech tagging using Markov Models and decision trees. After studying different approaches to and problems of part-of-speech tagging, we have implemented a tagging system based on TreeTagger, a generic stochastic tagging tool that is very popular for its efficiency. We have gathered(More)
The aim of this paper is to present the first Amazighe POS tagger. Very few linguistic resources have been developed so far for Amazighe, and we believe that the development of a POS tagger tool is the first step needed for automatic text processing. To this end, we have trained two sequence classification models using Support Vector(More)
Like most languages that have only recently started being investigated for Natural Language Processing (NLP) tasks, Amazigh lacks annotated corpora and still suffers from a scarcity of linguistic tools and resources. The main aim of this paper is to present a new part-of-speech (POS) tagger based on a new Amazigh tag set (AMTS)(More)
Language resources are important for those working on computational methods to analyze and study languages. These resources are needed to help advance research in fields such as natural language processing, machine learning, information retrieval, and text analysis in general. We describe the creation of a morphosyntactically annotated corpus for Amazigh(More)