Rajendra Kumar Roul

The dynamic web has grown exponentially over the past few years, with thousands of documents on any given subject now available to the user. Most web documents are unstructured and unorganized, which makes it harder for users to find relevant documents. A more useful and efficient mechanism is combining clustering with…
The traditional Apriori algorithm can be used to cluster web documents based on the association technique of data mining. However, the algorithm has several limitations owing to its repeated database scans and its weak association rule analysis. On today's large databases, the efficiency of the traditional Apriori algorithm drops manyfold. In this…
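To make the "repeated database scans" limitation concrete, here is a minimal sketch of the textbook level-wise Apriori procedure, assuming each document is represented as a set of terms. It is not the method proposed in the paper above; the function name `apriori`, the `min_support` parameter, and the sample documents are hypothetical and chosen only for illustration. The `scans` counter records one full support-counting pass over the collection per itemset size.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Textbook Apriori: return (frequent itemsets with counts, support-counting passes)."""
    transactions = [frozenset(t) for t in transactions]
    scans = 0
    frequent = {}
    # Level-1 candidates: every distinct item as a 1-itemset.
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        # One full pass over all transactions per level: the repeated scan.
        scans += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: union pairs of frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, scans


# Hypothetical toy collection: each document reduced to a set of terms.
docs = [
    {"web", "mining", "cluster"},
    {"web", "mining"},
    {"web", "cluster"},
    {"mining", "cluster", "web"},
]
frequent, scans = apriori(docs, min_support=2)
print(scans, len(frequent))
```

Because the pass count grows with the size of the largest frequent itemset, and every pass touches the whole collection, the cost becomes significant on large databases, which is the inefficiency the abstract refers to.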
A search engine returns thousands of web pages for a single user query, most of which are not relevant. In this context, effective information retrieval from the expanding web is a challenging task, particularly when the query is ambiguous. The major question is how to retrieve the relevant pages for an ambiguous query. We propose an…
The World Wide Web serves as a huge repository of information that is highly dynamic, diverse, and growing at an exponential rate. To speed up and further improve tasks such as information search, retrieval, and personalization, it is important to develop techniques that classify text documents more accurately and…
The exponential growth of the web has increased the importance of web document classification and data mining. Obtaining exact information about which classes a web document belongs to is expensive. Automatic classification of web documents is of great use to search engines, as it provides this information at low cost. In this paper, we propose…
The size of the web has increased exponentially over the past few years, with thousands of documents on any given subject available to the user. With this much information available, it is not possible to take full advantage of the World Wide Web without a proper framework for searching through the available data. This requisite organization can…
The aim of text classification is to assign text documents to a set of predefined categories. However, the complexity of natural languages, the high-dimensional feature space, and low-quality feature selection are the main problems for the text classification process. Hence, in order to strengthen the classification technique, selection of important features,…
The number of digital documents on the Web, each a collection of a huge volume of features, is increasing day by day. Hence, selecting the important features relevant to the classification process, and consequently discarding the irrelevant ones, is the need of the hour. Toward this end, this paper highlights two important aspects of Information…