Web Mining: a Roadmap

Abstract

The World Wide Web has grown in the past few years from a small research community into the largest and most popular medium for communication and information dissemination. Every day, the WWW grows by roughly a million electronic pages, adding to the hundreds of millions already online. The WWW serves as a platform for exchanging various kinds of information, ranging from research papers and educational content to multimedia content and software. The continuous growth in the size and use of the WWW calls for new methods of processing these huge amounts of data. Because of its rapid and chaotic growth, the resulting network of information lacks organization and structure. Moreover, the content is published in diverse formats. As a result, users sometimes feel disoriented, lost in an information overload that continues to expand. The issues that have to be dealt with are the detection of relevant information, which involves searching and indexing Web content; the creation of metaknowledge out of the information available on the Web; and the addressing of individual users’ needs and interests by personalizing the provided information and services. Web mining is a very broad research area that has emerged to address the issues arising from the WWW phenomenon. It draws on several converging research communities, such as Databases, IR and AI. In this work we give an overview of the most important issues in each of the three axes of Web mining, namely Web structure, Web content and Web usage mining. We also attempt a prediction concerning the future of Web mining: the combination of the methods used in all three categories, moving towards the Semantic Web vision.

Cite this paper

@inproceedings{Eirinaki2007WebMA, title={Web Mining: a Roadmap}, author={Magdalini Eirinaki}, year={2007} }