Mohammad Shahid Shaikh

Learn More
—The activity of labeling of documents according to their content is known as text categorization. Many experiments have been carried out to enhance text categorization by adding background knowledge to the document using knowledge repositories like Word Net, Open Project Directory (OPD), Wikipedia and Wikitology. In our previous work, we have carried out(More)
The process of text categorization assigns labels or categories to each text document according to the semantic content of the document. The traditional approaches to text categorization used features from the text like: words, phrases, and concepts hierarchies to represent and reduce the dimensionality of the documents. Recently, researchers addressed this(More)
— A major computational burden, while performing document clustering, is the calculation of similarity measure between a pair of documents. Similarity measure is a function that assigns a real number between 0 and 1 to a pair of documents, depending upon the degree of similarity between them. A value of zero means that the documents are completely(More)
Document clustering is an unsupervised approach in which a large collection of documents (corpus) is subdivided into smaller, meaningful, identifiable, and verifiable subgroups (clusters). Meaningful representation of documents and implicitly identifying the patterns, on which this separation is performed, is the challenging part of document clustering. We(More)
  • 1