Mohammed Benkhalifa

Text Categorization (TC) is the automated assignment of text documents to predefined categories based on document contents. Over the past few years, TC has become very important, especially in the Information Retrieval area, where information needs have increased tremendously with the rapid growth of textual information sources such as the Internet. In this ...
Text Categorization (TC) is the automated assignment of text documents to predefined categories based on document contents. TC has been an application area for many learning approaches, which have proved effective. Nevertheless, TC poses many challenges to machine learning. In this paper, we suggest, for text categorization, the integration of external WordNet ...
Mohammed Benkhalifa (1), Abdelhak Mouradi (2), Houssaine Bouyakhf (3). (1) School of Science and Engineering, Al Akhawayn University in Ifrane (AUI), P.O. Box 1828, Av. Hassan II, Ifrane 53000, Morocco. (2) Ecole Nationale Supérieure d’Informatique et d’Analyses des Systèmes (ENSIAS), Mohammed V University, P.O. Box 713, Agdal Rabat, Morocco. (3) Computer Science ...
Nowadays, the web is the most relevant data source, and its size keeps growing day by day. Web page classification has become crucial because of this overwhelming amount of data. Web pages contain much noisy content that biases textual classifiers and makes them lose focus on the page's main subject. Web pages are related to each other either implicitly by users' ...
Support Vector Machine (SVM) is a powerful classifier widely used in text and web classification. It tries to find a hyperplane that separates positive and negative data while maximizing the margin. SVM is a kernel-based classifier, and the choice of kernel is critical. We propose in this paper an implicit-links-based Gaussian kernel that uses an ...
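As a rough illustration of the kernel idea mentioned in the abstract above, the following Python sketch computes a standard Gaussian (RBF) kernel between feature vectors. It is not the implicit-links-based kernel the paper proposes, since the abstract is cut off before describing it. The function names `gaussian_kernel` and `gram_matrix`, the `gamma` parameter, and the toy vectors are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """Standard Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    Hypothetical helper for illustration; gamma is an assumed width parameter.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(docs: Sequence[Sequence[float]], gamma: float = 0.5) -> List[List[float]]:
    """Pairwise kernel values, the matrix a kernel SVM would train on."""
    return [[gaussian_kernel(a, b, gamma) for b in docs] for a in docs]


# Toy term-frequency vectors (made-up data, not from the paper).
docs = [
    [1.0, 0.0, 2.0],
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
]
K = gram_matrix(docs)
```

Identical vectors get a kernel value of 1, and the value decays toward 0 as the vectors move apart, which is why the kernel can serve as a similarity measure between documents.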
With the rapid growth of the web's size, web page classification is becoming more prominent. The way a web page is represented and the contextual features used for this representation both affect classification performance. Thus, finding an adequate representation of web pages is essential for better web page classification. In this paper, we ...
Textual document classification is a challenging area of data mining. Web page classification is a type of textual document classification. However, the text contained in web pages is not homogeneous, since a web page can discuss related but different subjects. Thus, the results obtained by a textual classifier on web pages are not as good as those obtained ...
Automatic web page classification plays an essential role in information retrieval, web mining and semantic web applications. Web pages have special characteristics (such as HTML tags and hyperlinks) that make their classification different from standard text categorization. Thus, when applied to web data, traditional text classifiers do not usually ...