A mixed discrete-continuous attribute list representation for large scale classification domains

@inproceedings{Bacardit2009AMD,
  title={A mixed discrete-continuous attribute list representation for large scale classification domains},
  author={Jaume Bacardit and Natalio Krasnogor},
  booktitle={Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation (GECCO)},
  year={2009}
}
  • J. Bacardit, N. Krasnogor
  • Published 8 July 2009
  • Computer Science
  • Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Datasets with a large number of attributes are a difficult challenge for evolutionary learning techniques. [...] Secondly, we benchmark the new representation with a diverse set of large-scale datasets and, third, we compare the new algorithms with several well-known machine learning methods. The experimental results described in the paper show that the new representation is equal to or better than the state of the art in evolutionary rule representations, both in terms of the accuracy obtained with…
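The abstract above concerns a rule representation in which each rule lists only the attributes it actually tests, with continuous attributes expressed as intervals and discrete attributes as value sets. The following is a minimal illustrative sketch of that idea, not the paper's actual implementation; all names and structures here are assumptions for exposition.

```python
# Illustrative sketch (hypothetical, not the paper's code) of a
# mixed discrete-continuous attribute-list rule: the rule stores
# predicates only for relevant attributes; everything else is
# implicitly "don't care".

class AttributeListRule:
    def __init__(self, predicates, predicted_class):
        # predicates maps attribute index -> ("cont", lo, hi)
        #                                 or ("disc", {allowed values})
        self.predicates = predicates
        self.predicted_class = predicted_class

    def matches(self, instance):
        # Matching cost scales with the number of listed attributes,
        # not with the total number of attributes in the dataset.
        for idx, pred in self.predicates.items():
            if pred[0] == "cont":
                _, lo, hi = pred
                if not (lo <= instance[idx] <= hi):
                    return False
            else:
                _, values = pred
                if instance[idx] not in values:
                    return False
        return True

rule = AttributeListRule(
    {0: ("cont", 0.2, 0.8),    # attribute 0: continuous interval
     3: ("disc", {"a", "b"})},  # attribute 3: allowed discrete values
    predicted_class=1,
)
print(rule.matches([0.5, 99, 99, "a"]))  # True: both predicates hold
print(rule.matches([0.9, 99, 99, "a"]))  # False: 0.9 outside [0.2, 0.8]
```

Attributes 1 and 2 are never consulted, which is the source of the efficiency gain on high-dimensional data that the abstract claims.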
An Extended Michigan-Style Learning Classifier System for Flexible Supervised Learning, Classification, and Data Mining
TLDR
This work introduces ExSTraCS (Extended Supervised Tracking and Classifying System), as a promising platform to address the challenges of modeling complex patterns of association, systems biology, and ‘big data’ challenges using supervised learning and a Michigan-Style LCS architecture.
ExSTraCS 2.0: description and evaluation of a scalable learning classifier system
TLDR
Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improved nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to related 200- and 2000-attribute datasets.
Automatic Tuning of Rule-Based Evolutionary Machine Learning via Problem Structure Identification
TLDR
This work presents a parameter setting mechanism for a rule-based evolutionary machine learning system that is capable of finding the adequate parameter value for a wide variety of synthetic classification problems with binary attributes and with/without added noise.
Integrating memetic search into the BioHEL evolutionary learning system for large-scale datasets
TLDR
This paper adapts memetic operators for discrete representations, which use information from the supervised learning process to heuristically edit classification rules and rule sets, to BioHEL, a different evolutionary learning system applying the iterative learning approach, and proposes versions of these operators designed for continuous attributes and for dealing with noise.
Principled design of evolutionary learning systems for large scale data mining
TLDR
The objective of this thesis is to improve the efficiency of the Bioinformatic Hierarchical Evolutionary Learning (BioHEL) system, a classifier system designed to handle large domains that uses an Iterative Rule Learning approach to generate a set of rules one by one via consecutive Genetic Algorithms.
Rule-based machine learning classification and knowledge discovery for complex problems
TLDR
ExSTraCS is an Extended Supervised Tracking and Classifying System based on the Michigan-Style LCS architecture that offers an accessible, user friendly LCS platform for supervised rule-based machine learning, classification, data mining, prediction, and knowledge discovery.
Analysing BioHEL using challenging boolean functions
TLDR
The results show that BioHEL is highly sensitive to the choice of coverage breakpoint and that using a default class suitable for the problem allows the system to learn faster than using other default class policies (e.g. the majority class policy).
Sparse, guided feature connections in an Abstract Deep Network
TLDR
The ADN system provides a method for developing a very sparse, deep feature topology, based on observed relationships between features, that is able to find solutions in irregular domains and to initialize a network prior to gradient descent learning.
Generic approaches for parallel rule matching in learning classifier systems
TLDR
This work presents a generic parallel approach for matching that only relies on the minimum assumption of a multicore CPU with standard synchronization mechanisms, e.g., locks.
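The TLDR above describes data-parallel rule matching that assumes nothing beyond a multicore CPU and standard synchronization such as locks. A hypothetical sketch of that scheme, with the instance set split across worker threads and results merged under a lock (all names here are illustrative, not from the cited work):

```python
import threading

def matches(rule, instance):
    # rule: list of (attr_index, lo, hi) interval predicates
    return all(lo <= instance[i] <= hi for i, lo, hi in rule)

def parallel_match(rule, instances, n_workers=4):
    matched = []
    lock = threading.Lock()  # the "standard synchronization mechanism"

    def worker(chunk):
        # Each worker matches its own slice of the instances...
        local = [inst for inst in chunk if matches(rule, inst)]
        with lock:               # ...and merges results under the lock.
            matched.extend(local)

    chunks = [instances[k::n_workers] for k in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matched

rule = [(0, 0.0, 1.0)]
instances = [[0.5], [1.5], [0.9], [2.0]]
print(sorted(parallel_match(rule, instances)))  # [[0.5], [0.9]]
```

The point of the sketch is the structure, not the speedup: matching is embarrassingly parallel over instances, so the only shared state is the result list, guarded by a single lock.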

References

Showing 1-10 of 33 references
Improving the scalability of rule-based evolutionary learning
TLDR
A new representation motivated by observations that Bioinformatics and Systems Biology often give rise to very large-scale datasets that are noisy, ambiguous and usually described by a large number of attributes is presented, which is up to 2–3 times faster than state-of-the-art evolutionary learning representations designed specifically for efficiency purposes.
SIA: A Supervised Inductive Algorithm with Genetic Search for Learning Attributes based Concepts
TLDR
The SIA algorithm is somewhat similar to the AQ algorithm in that it takes an example as a seed and generalizes it, using a genetic process, to find a rule maximizing a noise-tolerant rule evaluation criterion.
An Introduction to Variable and Feature Selection
TLDR
The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Performance and Efficiency of Memetic Pittsburgh Learning Classifier Systems
TLDR
Several local search mechanisms that heuristically edit classification rules and rule sets to improve their performance are empirically evaluated, identifying which combination of operators and policies scale well, are robust to noise, generate compact solutions, and use the least amount of computational resources to solve the problems.
An analysis of matching in learning classifier systems
TLDR
The results on typical test problems show that the specificity-based representation can halve the time required for matching but also that binary encoding is about ten times faster on the most difficult problems.
A Method for Handling Numerical Attributes in GA-Based Inductive Concept Learners
TLDR
Results of experiments on various data sets indicate that the method provides an effective local discretization tool for GA based inductive concept learners.
Statistical Comparisons of Classifiers over Multiple Data Sets
  • J. Demšar
  • Computer Science
    J. Mach. Learn. Res.
  • 2006
TLDR
A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
Automated alphabet reduction method with evolutionary algorithms for protein structure prediction
TLDR
The results show that it is possible to reduce the size of the alphabet used for prediction from the twenty-letter amino acid alphabet to just three letters, resulting in more compact, i.e. more interpretable, rules.
Data Mining and Knowledge Discovery with Evolutionary Algorithms
TLDR
This book integrates two areas of computer science, namely data mining and evolutionary algorithms, and emphasizes the importance of discovering comprehensible, interesting knowledge, which is potentially useful for intelligent decision making.
Prediction of recursive convex hull class assignments for protein residues
TLDR
It is shown that a residue's RCH class contains information complementary to widely studied measures such as solvent accessibility, residue depth, and the residue's exposure (Exp), i.e. its distance from the centroid of the chain.