Rough Set Based Clustering Using Active Learning Approach

Authors: Rekha Kandwal, Prerna Mahajan, Ritu Vijay
Journal: Int. J. Artif. Life Res.
This paper revisits the problem of active learning and decision making when labeling is costly and unlabeled data is abundant. In many real-world applications, large amounts of data are available, but the cost of labeling them correctly prohibits their use. In such cases, active learning can be employed. The authors propose rough set based clustering using an active learning approach. They extend the basic notion of Hamming distance to propose a…

Rough set based clustering in dense web domain

The clustering task for sequence data (web page visits) is demonstrated in three ways: capturing content information, capturing sequence information, and capturing a combination of both. The results suggest that the measure which captures both content and sequence forms compact clusters, thus placing web users with similar interests in one group.

An alternative approach for clustering web user sessions considering sequential information

The Sequence and Set Similarity Measure S^{3}M is used with a rough set based similarity upper approximation clustering algorithm to group web users by their navigational patterns, showing the viability of this approach.

Linking Interactome to Disease: A Network-Based Analysis of Metastatic Relapse in Breast Cancer

To find a predictive signature that generalizes across multiple datasets, a strategy was devised that superimposes large-scale protein-protein interaction data (the human interactome) on several gene expression datasets, in order to find discriminative regions in the interactome (subnetworks) that predict metastatic relapse in breast cancer.

Mapping with Monocular Vision in Two Dimensions

This article shows how maps that are correct up to scale can be built without knowledge of the camera's intrinsic parameters or of its speed during uniform motion, and how performing an inverse parameterization of the image coordinates turns the mapping problem into one of fitting line segments to a group of points.

Response Curves for Cellular Automata in One and Two Dimensions: An Example of Rigorous Calculations

This paper considers the problem of computing a response curve for binary cellular automata, that is, the curve describing how the density of ones after many iterations of the rule depends on the initial density of ones, and discusses the special case of totally disordered initial configurations.

A New Approach to Pattern Recognition in Fractal Ferns

Two advanced iterations from nonlinear analysis are introduced into the study of IFS for the generation and pattern recognition of new fractal ferns.

FPGA Coprocessor for Simulation of Neural Networks Using Compressed Matrix Storage

The connectionist model of information processing assumes that useful behavior is an emergent property of a huge number of relatively simple interacting units, but this is not the case and simulations are the predominant tool to understand the behavior of the modeled systems.

Conspecific Emotional Cooperation Biases Population Dynamics: A Cellular Automata Approach

Simulations support the hypothesis that the acquisition of emotion may be an evolutionary result of competitive species interactions and indicate that emotions increase adaptability, help control disease, and improve survival for the species that utilizes them.

An Overview of Multimodal Interaction Techniques and Applications

Currently, multimodal interfaces have started to understand 3D hand gestures, body postures, and facial expressions, thanks to recent progress in computer vision techniques.

Generating Fully Bounded Chaotic Attractors

This paper considers a class of 2-D mappings displaying fully bounded chaotic attractors for all bifurcation parameters and describes in detail the dynamical behavior of this map, along with some other dynamical phenomena.



A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining

This paper presents an algorithm, called k-modes, that extends the k-means paradigm to categorical domains. It introduces new dissimilarity measures to deal with categorical objects, replaces the means of clusters with modes, and uses a frequency-based method to update modes during clustering so as to minimise the clustering cost function.
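
The k-modes idea described above can be illustrated with a minimal sketch. It assumes simple matching (the count of mismatched attributes) as the dissimilarity and a batch update of modes after each assignment pass; the function names, the caller-supplied `initial_modes`, and the batch update are illustrative assumptions, not the cited paper's exact procedure.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching: number of attributes on which two objects differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(rows):
    """Most frequent category per attribute (frequency-based mode)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, initial_modes, max_iter=20):
    """Assign each object to its nearest mode, then recompute the modes.

    Assumption: modes are updated in batch after each full assignment
    pass, which is a simplification chosen for this sketch.
    """
    modes = list(initial_modes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)),
                key=lambda j: matching_dissimilarity(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = cluster_mode(members)
    return labels, modes


data = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z"), ("a", "x")]
labels, modes = k_modes(data, initial_modes=[("a", "x"), ("b", "z")])
```

Because modes are ordinary category tuples rather than numeric averages, the same loop structure as k-means applies without any numeric encoding of the attributes.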

A Rough Set-Based Hierarchical Clustering Algorithm for Categorical Data

A categorical similarity measure based on Euclidean distance is given to address the difficulty of measuring similarity in categorical data, which arises from its non-numerical nature.

Feature Selection for Unsupervised Learning

This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.

Study of a Cluster Algorithm Based on Rough Sets Theory

  • Licai Yang, Lancang Yang
  • Computer Science
    Sixth International Conference on Intelligent Systems Design and Applications
  • 2006
By combining the rough set method for calculating equivalence classes with the k-medoids algorithm, an improved clustering algorithm is presented. It is shown to be effective at discovering clusters of arbitrary shape and at setting the number of clusters, which is difficult for traditional clustering algorithms.
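
As a rough illustration of what calculating equivalence classes means in rough set theory, the sketch below groups objects that are indiscernible on a chosen set of attributes. The data layout (a dict of attribute dicts) and the function name are assumptions made for this example; how the cited paper feeds these classes into k-medoids is not shown here.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Partition objects by their values on the given attributes.

    Two objects fall into the same class when they agree on every
    attribute in `attributes` (the indiscernibility relation).
    """
    classes = defaultdict(list)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].append(name)
    return list(classes.values())


# Hypothetical toy data, used only to exercise the function.
objects = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "large"},
    "o3": {"color": "red", "size": "small"},
    "o4": {"color": "blue", "size": "large"},
}
by_color = equivalence_classes(objects, ["color"])
by_both = equivalence_classes(objects, ["color", "size"])
```

Adding attributes can only split classes further, never merge them, which is why `by_both` is a refinement of `by_color`.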

Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values

  • J. Huang
  • Computer Science
    Data Mining and Knowledge Discovery
  • 2004
Two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical values are presented and are shown to be efficient when clustering large data sets, which is critical to data mining applications.

Rough–Fuzzy Collaborative Clustering

A novel clustering architecture is introduced, in which several subsets of patterns can be processed together with an objective of finding a common structure, and the required communication links are established at the level of cluster prototypes and partition matrices.

Data clustering: a review

An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.

Clustering transactions using large items

This work proposes the notion of large items, i.e., items contained in some minimum fraction of transactions in a cluster, to measure the similarity of a cluster of transactions.
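
The definition of large items above can be sketched directly: count, for each item, the number of transactions in a cluster that contain it, and keep the items that reach the minimum fraction. The function name and the `min_support` parameter are assumptions for illustration; the similarity criterion that the cited work builds on top of large items is not reproduced here.

```python
from collections import Counter


def large_items(cluster, min_support):
    """Items contained in at least `min_support` (a fraction in [0, 1])
    of the transactions in `cluster`."""
    # set(t) ensures an item is counted once per transaction.
    counts = Counter(item for t in cluster for item in set(t))
    threshold = min_support * len(cluster)
    return {item for item, c in counts.items() if c >= threshold}


# Hypothetical cluster of four transactions.
cluster = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"b"}]
half = large_items(cluster, 0.5)
everywhere = large_items(cluster, 1.0)
```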

A New Clustering Algorithm On Nominal Data Sets

A new clustering technique, named the Olary algorithm, is presented. It is suitable for clustering nominal data sets and provides a useful way to estimate the number of underlying clusters using a new kind of diagram, called the Number of Clusters versus Distance Diagram (NCDD for short).

Discretization Algorithms of Rough Sets Using Clustering

In this paper, a hierarchical clustering method is introduced for attribute discretization; it can automatically determine the significant clusters and remain consistent with the clusters extracted from dendrograms.