KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework
TLDR
The aim of this paper is to present three new aspects of KEEL: KEEL-dataset, a data set repository which includes the data set partitions in the KEEL format and shows some results of algorithms in these data sets; some guidelines for including new algorithms in KEEL, helping the researchers to make their methods easily accessible to other authors and to compare the results of many approaches already included within the KEEL software.
  • 1,599
  • 100
Genetics-Based Machine Learning for Rule Induction: State of the Art, Taxonomy, and Comparative Study
TLDR
The classification problem can be addressed by numerous techniques and algorithms which belong to different paradigms of machine learning.
  • 140
  • 11
Big data preprocessing: methods and prospects
TLDR
The massive growth in the scale of data observed in recent years is a key factor of the Big Data scenario.
  • 138
  • 4
Towards Highly Accurate Coral Texture Images Classification Using Deep Convolutional Neural Networks and Data Augmentation
TLDR
The recognition of coral species based on underwater texture images poses a significant difficulty for machine learning algorithms, due to the following three challenges embedded in the nature of this data: 1) datasets do not include information about the global structure of the coral; 2) several species of coral have very similar characteristics.
  • 39
  • 2
Enabling Smart Data: Noise filtering in Big Data classification
TLDR
In any knowledge discovery process the value of extracted knowledge is directly related to the quality of the data used.
  • 52
  • 1
COVIDGR Dataset and COVID-SDNet Methodology for Predicting COVID-19 Based on Chest X-Ray Images
TLDR
We proposed a methodology, called COVID-SDNet, that combines segmentation, data-augmentation and data transformation to improve the generalization capacity of COVID classification models.
  • 7
  • 1
DILS: Constrained clustering through dual iterative local search
TLDR
We propose a new metaheuristic algorithm, the Dual Iterative Local Search, and prove its ability to produce quality results for the constrained clustering problem.
  • 1
  • 1
Transforming big data into smart data: An insight on the use of the k‐nearest neighbors algorithm to obtain quality data
TLDR
We present the emerging big data‐ready versions of these algorithms and develop some new methods to cope with Big Data.
  • 27
A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost
TLDR
In classification, class noise refers to the incorrect labelling of instances, which causes classifiers to perform worse.
  • 13
Implementation and Integration of Algorithms into the KEEL Data-Mining Software Tool
TLDR
This work is related to the KEEL (Knowledge Extraction based on Evolutionary Learning) tool, a non-commercial software tool that supports data management and the design of experiments, and includes an educational section.
  • 8