This paper focuses on the development of a maintainable information filtering system. A simple and efficient solution to this problem is to block Web sites by URL, including IP address. However, this approach is ineffective for unknown Web sites, and a complete block list is difficult to obtain. Content-based filtering has been suggested to overcome this problem as …
Automated help desk systems should retrieve exactly the information required to assist a user as quickly and as easily as possible, whether for a lay user who knows little about the domain or for an advanced user who requires more specialized information. Automated help desk systems should also be easily maintainable, as knowledge in domains where help is …
Two main research areas in statistical text categorization are similarity-based learning algorithms and their associated thresholding strategies. The combination of these techniques significantly influences the overall performance of text categorization. After investigating two similarity-based classifiers (k-NN and Rocchio) and three common thresholding …
These days, billions of Web pages are created with HTML or other markup languages. Compared to traditional text-based documents, they have few uniform structures and vary widely in authoring style. However, users usually focus on the particular section of a page that presents the information most relevant to their interests. Therefore, Web documents …
Multiple Classification Ripple Down Rules (MCRDR) is a knowledge acquisition technique that produces representations, or knowledge maps, of a human expert's knowledge of a particular domain. However, work on gaining an understanding of the knowledge acquired at a deeper meta-level or using the knowledge to derive new information is still in its infancy. …
Knowledge Discovery techniques find new knowledge about a domain by analysing existing domain knowledge and examples of domain data. These techniques typically involve both a human expert and automated software analysis (Data Mining). Often, human expertise is used initially to choose which data is processed, and then finally to determine which results …