Using the N-gram technique without stemming is not appropriate in the context of Arabic text classification. To address this, we introduce a new stemming technique, which we call “approximate-stemming”, based on the use of Arabic patterns. These are modeled using transducers, and stemming is done without depending on any dictionary. This stemmer will be …
Automatic recognition of pavement deterioration from a digital image is a difficult problem. The image classification problem has been widely studied, particularly in the medical imaging field. Compared to this, our problem adds two main difficulties. The first is the characterization of the different types of deterioration, and the second is the removal …
In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are modelled using transducers and stemming is done without depending on any dictionary. Using transducers for stemming, …
Kernel methods have enjoyed great success in machine learning. This success is mainly due to their flexibility in dealing with the high dimensionality of the feature space of complex data such as graphs, trees, or textual data. In the field of text classification (TC), their performance has surpassed that of traditional algorithms. For textual data, different kernels were …
This paper proposes a novel clustering approach for XML documents that combines their content and structure information, using tree structural-content summaries to reduce the size of the document. This reduction has a twofold purpose. First, it reduces the size of the XML tree by eliminating redundant nodes. Second, it gathers similar content. …
We address the problem of minimizing tree automata, in particular its incremental version. Unlike classical minimization, the incremental version [1] computes equivalences between states in a safe way, so that the algorithm may be halted at any moment, returning a partially minimized tree automaton. However, this incremental version has worse time complexity …
Sequence kernels are widely used for learning from sequential data. The literature includes a variety of sequence kernels. In this paper, we present a general framework to deal with sequence kernels, termed the weighted automata sequence kernel. In fact, the mapping of a string s to a high dimensional feature space can be modeled by a formal power series that …
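As an illustration of the idea of mapping a string s to a feature space, the following minimal Python sketch computes a simple n-gram count kernel. It is not the weighted automata framework of the paper above; the function names `ngram_features` and `ngram_kernel` and the default `n=2` are assumptions chosen for the example.

```python
from collections import Counter


def ngram_features(s, n=2):
    """Map a string to its n-gram counts (a sparse feature vector)."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_kernel(s, t, n=2):
    """Inner product of the n-gram count vectors of s and t."""
    fs, ft = ngram_features(s, n), ngram_features(t, n)
    # Only n-grams present in both strings contribute to the sum.
    return sum(count * ft[gram] for gram, count in fs.items())
```

Two strings that share no n-gram get a kernel value of 0, and the value grows with the number of shared n-gram occurrences.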