# Parallel MCNN (pMCNN) with Application to Prototype Selection on Large and Streaming Data

@article{Devi2017ParallelM, title={Parallel MCNN (pMCNN) with Application to Prototype Selection on Large and Streaming Data}, author={V. Susheela Devi and Lakhpat Meena}, journal={Journal of Artificial Intelligence and Soft Computing Research}, year={2017}, volume={7}, pages={155 - 169} }

Abstract: The Modified Condensed Nearest Neighbour (MCNN) algorithm for prototype selection is order-independent, unlike the Condensed Nearest Neighbour (CNN) algorithm. Though MCNN gives better performance, the time requirement is much higher than for CNN. To mitigate this, we propose a distributed approach called Parallel MCNN (pMCNN) which cuts down the time drastically while maintaining good performance. We have proposed two incremental algorithms using MCNN to carry out prototype selection…
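To make the order-dependence mentioned in the abstract concrete, here is a minimal sketch of the classic CNN rule (Hart, 1967), not of MCNN or pMCNN, whose details are not given on this page. The names `cnn_select`, `nn_label`, and `sq_dist` and the toy 1-D data are illustrative assumptions.

```python
from typing import List, Tuple

Sample = Tuple[Tuple[float, ...], int]  # (feature vector, class label)


def sq_dist(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_label(store: List[Sample], x: Tuple[float, ...]) -> int:
    """Label of the stored prototype nearest to x (1-NN rule)."""
    return min(store, key=lambda s: sq_dist(s[0], x))[1]


def cnn_select(data: List[Sample]) -> List[Sample]:
    """Hart's CNN rule: keep only samples the current store misclassifies.

    The store is seeded with the first sample, so the result depends on
    the order in which the data is presented.
    """
    store = [data[0]]
    changed = True
    while changed:  # repeat passes until a full pass adds nothing
        changed = False
        for x, y in data:
            # The membership check guards against looping forever on
            # identical points that carry conflicting labels.
            if nn_label(store, x) != y and (x, y) not in store:
                store.append((x, y))
                changed = True
    return store


# Toy data: class 0 at 0, 1, 2 and class 1 at 3, 4, 5 on a line.
order_a = [((0.0,), 0), ((1.0,), 0), ((2.0,), 0),
           ((3.0,), 1), ((4.0,), 1), ((5.0,), 1)]
# Same samples, presented in a different order.
order_b = [((2.0,), 0), ((3.0,), 1), ((0.0,), 0),
           ((1.0,), 0), ((4.0,), 1), ((5.0,), 1)]

print(len(cnn_select(order_a)), len(cnn_select(order_b)))  # prints: 3 2
```

Both runs return a subset that classifies every training sample correctly, yet the subset sizes differ purely because of presentation order. That is the behaviour the abstract says MCNN avoids.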

## 18 Citations

On Ensemble Components Selection in Data Streams Scenario with Gradual Concept-Drift

- Computer Science
- ICAISC
- 2018

The algorithm proposed in this paper is an enhanced version of the ASE (Automatically Sized Ensemble) algorithm which guarantees that a new component will be added to the ensemble only if it increases the accuracy not only for the current data chunk but also for the whole data stream.

An Instance Selection Algorithm Based on ReliefF

- Computer Science
- Int. J. Artif. Intell. Tools
- 2019

A new instance selection algorithm is proposed, based on the feature selection algorithm ReliefF; it can reduce data at a specified rate and can run in parallel over the instances.

On Handling Missing Values in Data Stream Mining Algorithms Based on the Restricted Boltzmann Machine

- Computer Science
- ICONIP
- 2019

This paper proposes two modifications of the RBM learning algorithms to make them able to handle missing values, and introduces dimension-dependent sizes of minibatches in the stochastic gradient descent method.

On the Parzen Kernel-Based Probability Density Function Learning Procedures Over Time-Varying Streaming Data With Applications to Pattern Classification

- Computer Science, Mathematics
- IEEE Transactions on Cybernetics
- 2020

A recursive variant of the Parzen kernel density estimator (KDE) is proposed to track changes of dynamic density over data streams in a nonstationary environment and it is shown how to choose the bandwidth and learning rate of a recursive KDE in order to ensure weak and strong convergence.

A New Approach to Detection of Changes in Multidimensional Patterns

- Computer Science
- J. Artif. Intell. Soft Comput. Res.
- 2020

A new approach to abrupt change detection is proposed, based on Parzen kernel estimation of the partial derivatives of multivariate regression functions in the presence of probabilistic noise.

On the Global Convergence of the Parzen-Based Generalized Regression Neural Networks Applied to Streaming Data

- Mathematics
- ICAISC
- 2018

The mean integrated squared error of the regression estimate is shown to converge under several conditions, and results illustrate asymptotic properties of the Parzen-type recursive algorithm and its convergence for a wide spectrum of time-varying noise.

Estimation of Probability Density Function, Differential Entropy and Other Relative Quantities for Data Streams with Concept Drift

- Mathematics, Computer Science
- ICAISC
- 2018

Estimators of the Cauchy-Schwarz divergence and the probability density function divergence are proposed, which are used to measure the differences between two probability density functions.

Parallel Processing of Color Digital Images for Linguistic Description of Their Content

- Computer Science
- PPAM
- 2017

This paper presents different aspects of parallelization of a problem of processing color digital images in order to generate linguistic description of their content. A parallel architecture of an…

On the Hermite Series-Based Generalized Regression Neural Networks for Stream Data Mining

- Mathematics, Computer Science
- ICONIP
- 2019

The mathematically justified stream data mining algorithm for solving regression problems is developed, based on the Hermite expansions of drifting regression functions, and the global convergence is proved both in probability and with probability one.

Parallel Processing of Images Represented by Linguistic Description in Databases

- Computer Science
- PPAM
- 2019

The problem of image retrieval and classification is presented by use of the linguistic description represented in databases, and the rough granulation, by using the CIE chromaticity color model and granulation approach.

## References

Distributed Nearest Neighbor-Based Condensation of Very Large Data Sets

- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2007

This work presents the parallel fast condensed nearest neighbor (PFCNN) rule, a distributed method for computing a consistent subset of a very large data set for the nearest neighbor classification rule, and is the first distributed algorithm for computing a training set consistent subset for the nearest neighbor rule.

Fast condensed nearest neighbor rule

- Computer Science
- ICML
- 2005

This work presents a novel algorithm for computing a training set consistent subset for the nearest neighbor decision rule, and compares it with state of the art competence preservation algorithms on large multidimensional training sets, showing that it outperforms existing methods in terms of learning speed and learning scaling behavior.

Efficient instance-based learning on data streams

- Computer Science
- Intell. Data Anal.
- 2007

This paper considers the problem of classification on data streams and develops an instance-based learning algorithm for that purpose and suggests that this algorithm has a number of desirable properties that are not, at least not as a whole, shared by currently existing alternatives.

Prototype Selection Algorithms for kNN Classifier: A Survey

- Computer Science
- 2014

This paper provides a survey of the prototype selection method's categorization/taxonomy that could be considered relevant and different properties could be observed in the definition of these methods, but no formal categorization has been established yet.

The Condensed Nearest Neighbor Rule

- Computer Science
- 1967

The CNN rule is suggested as a rule which retains the basic approach of the NN rule without imposing such stringent storage requirements, and the notion of a consistent subset of a sample set is defined.

High performance parallel evolutionary algorithm model based on MapReduce framework

- Computer Science
- Int. J. Comput. Appl. Technol.
- 2013

In order to justify the effectiveness of the MR-PEA model, a parallel gene expression programming algorithm based on MapReduce (MR-GEP), used to solve symbolic regression, is proposed.

A Nearest Prototype Selection Algorithm Using Multi-objective Optimization and Partition

- Computer Science
- 2013 Ninth International Conference on Computational Intelligence and Security
- 2013

The simulation results indicate that the proposed algorithm can obtain a smaller reduction ratio and higher classification efficiency than some existing algorithms, or at least comparable results, which illustrates that the proposed algorithm is an expedient method for designing nearest neighbor classifiers.

An Adaptive Nearest Neighbor Classification Algorithm for Data Streams

- Computer Science
- PKDD
- 2005

Tests performed on both synthetic and real-life data indicate that the new classifier outperforms existing algorithms for data streams in terms of accuracy and computational costs.

A scalable parallel implementation of evolutionary algorithms for multi-objective optimization on GPUs

- Computer Science
- 2015 IEEE Congress on Evolutionary Computation (CEC)
- 2015

This paper proposes a parallel GPU-based implementation of NSGA-II with a major focus on non-dominated sorting, which can be easily coupled with the original form of NSGA-II to solve real-world problems using large populations.