• Corpus ID: 30627795

Semantically Document Clustering Using Contextual Similarities

Authors: R. Nagaraj and X. Agnise Kalarani
Efficient document clustering can be performed using term-level, sentence-level, and concept-level techniques in the high-dimensional document space. Most existing techniques suffer from problems such as the two-variable problem, high computational time, and low similarity relatedness, all of which reduce clustering efficiency. To overcome these drawbacks, this paper proposes a hybrid clustering algorithm called the Semantically Document Clustering algorithm. The Semantically Document…
2 Citations

Contextual Abstraction Based Clustering Technique for Effective Text Document Mining
A novel Contextual Abstraction based Document Clustering (CADC) technique is proposed to improve document clustering performance and information gain when clustering multidimensional data, and to reduce document retrieval time.
Exploring the Implementation of Cloud Security to Minimize Electronic Health Records Cyberattacks
This qualitative case study explored the strategies that senior information technology leaders in the healthcare industry use to implement cloud security and minimize electronic health record cyberattacks.


Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm
Results of applying the algorithm to sentence clustering tasks demonstrate that the algorithm is capable of identifying overlapping clusters of semantically related sentences, and that it is therefore of potential use in a variety of text mining tasks.
An integration of fuzzy association rules and WordNet for document clustering
The experimental results show that the proposed Fuzzy Frequent Itemset-based Document Clustering (F2IDC) approach provides more accurate clustering results than prior influential clustering methods presented in recent literature.
Document Clustering in Correlation Similarity Measure Space
A new spectral clustering method called correlation preserving indexing (CPI) is performed in the correlation similarity measure space and can effectively discover the intrinsic structures embedded in high-dimensional document space.
A multi-layer text classification framework based on two-level representation model
Dirichlet Process Mixture Model for Document Clustering with Feature Partition
This paper proposes a novel approach, namely DPMFP, to discover the latent cluster structure based on the DPM model without requiring the number of clusters as input, and investigates a variational inference algorithm to infer the document collection structure and the partition of document words at the same time.
Large Scale Spectral Clustering with Landmark-Based Representation
This paper proposes a novel approach, called Landmark-based Spectral Clustering (LSC), for large scale clustering problems, where p representative data points are selected as the landmarks and the spectral embedding of the data can be efficiently computed with the landmark-based representation.
Document Representation with Statistical Word Senses in Cross-Lingual Document Clustering
Evaluation on the benchmarking datasets shows that the proposed models outperform two state-of-the-art methods for cross-lingual document clustering.
Comparative Analysis of K-Means and Fuzzy C-Means Algorithms
Two important clustering algorithms, namely centroid-based K-means and representative-object-based FCM (Fuzzy C-Means), are compared, and their performance is evaluated on the basis of the efficiency of the clustering output.