Improving Fitness Functions in Genetic Programming for Classification on Unbalanced Credit Card Datasets

Van Loi Cao, Nhien-An Le-Khac, Miguel Nicolau, M. A. O'Neill, James McDermott
Credit card classification based on machine learning has attracted considerable interest from the research community. One of the most important challenges in this area is the ability of classifiers to handle the imbalance in credit card data. In this scenario, classifiers tend to yield poor accuracy on the minority class despite achieving high overall accuracy. This is due to the influence of the majority class on traditional training criteria. In this paper, we aim to apply genetic programming to…
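The imbalance effect described in the abstract can be illustrated with a minimal sketch. The toy labels, the function names, and the use of balanced accuracy (the mean of the per-class accuracies) are assumptions chosen for illustration; they are not taken from the paper and are not necessarily the fitness functions it proposes.

```python
def class_accuracies(y_true, y_pred, minority=1):
    """Return (overall, minority-class, majority-class) accuracy."""
    pairs = list(zip(y_true, y_pred))
    overall = sum(t == p for t, p in pairs) / len(pairs)
    # Correctness flags split by the true class of each example.
    min_hits = [t == p for t, p in pairs if t == minority]
    maj_hits = [t == p for t, p in pairs if t != minority]
    return overall, sum(min_hits) / len(min_hits), sum(maj_hits) / len(maj_hits)


def balanced_accuracy(y_true, y_pred, minority=1):
    """Mean of the two per-class accuracies (an illustrative criterion only)."""
    _, acc_min, acc_maj = class_accuracies(y_true, y_pred, minority)
    return 0.5 * (acc_min + acc_maj)


# Hypothetical toy data: 95 majority examples (0) and 5 minority examples (1).
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

overall, acc_min, acc_maj = class_accuracies(y_true, y_pred)
print(overall, acc_min, acc_maj)          # 0.95 0.0 1.0
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

On this toy data, the always-majority classifier scores 0.95 on overall accuracy while getting every minority example wrong. A criterion that weights the two classes equally exposes the failure, which is the kind of behavior a fitness function for unbalanced data would need to penalize.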
A Classification Model For Class Imbalance Dataset Using Genetic Programming
A solution is proposed that uses entropy and information gain as the fitness function in a GA, with the objective of improving impurity, and that gives a more balanced result without changing the original dataset.
A bi-objective hybrid algorithm for the classification of imbalanced noisy and borderline data sets
A bi-objective hybrid algorithm is proposed for classifying binary imbalanced, noisy, and borderline data sets. It hybridizes two metaheuristics, namely cuckoo search and the covariance matrix adaptation evolution strategy.
Machine Assistance for Credit Card Approval? Random Wheel can Recommend and Explain
This work uses an enhanced version of random wheel to provide trustworthy recommendations for the credit card approval process; it not only produces more accurate and precise recommendations but also provides an interpretable confidence measure.
Automated Design of Genetic Programming Classification Algorithms Using a Genetic Algorithm
The automated design of genetic programming classification algorithms is proposed, and the results indicate that the induced classifiers perform better than manually designed classifiers.
Deep fraud. A fraud intention recognition framework in public transport context using a deep-learning approach
This paper presents a framework for fraud intention recognition of public transport bus operators based on a deep learning approach using a stack of denoising and sparse autoencoders, and compares it with other non-deep state-of-the-art classification approaches.
Extension des Programmes Génétiques pour l'apprentissage supervisé à partir de très larges Bases de Données (Big data). (Extending Genetic Programming for supervised learning from very large datasets (Big data))
The adaptation of GP to overcome the data Volume hurdle in Big Data problems is investigated, and a new sampling approach called “adaptive sampling” is formulated, based on controlling the sampling frequency depending on the learning process and through fixed, deterministic, and adaptive control schemes.
Applications of Evolutionary Algorithms to Management Problems
Evolutionary Algorithms are metaheuristics based on a rough abstraction of the mechanisms of natural evolution that have also attracted broader attention outside the scientific community in the last 15–20 years.
A novel procedure combining computational fluid dynamics and evolutionary approach to minimize parasitic power loss in air cooling of Li‐ion battery for thermal management system design
Air‐cooling‐based battery thermal management system (BTMS) is a research hotspot for electric vehicles because of lower cost and simpler design. Past research works have immensely concentrated on the


Developing New Fitness Functions in Genetic Programming for Classification With Unbalanced Data
This paper aims to highlight the limitations of the current GP approaches in this area, to develop several new fitness functions for binary classification with unbalanced data, and to show empirically that these new fitness functions evolve classifiers with good performance on both the minority and majority classes.
Genetic Programming for Classification with Unbalanced Data
This thesis proposes several new fitness functions in GP to perform cost adjustment between the minority and the majority classes, allowing the unbalanced data sets to be used directly in the learning process without sampling, and shows how multiple Pareto front classifiers can be combined into an ensemble where individual members vote on the class label.
Research on Credit Card Fraud Detection Model Based on Class Weighted Support Vector Machine
An improved SVM, the Imbalance Class Weighted SVM (ICW-SVM), is adopted, and it is demonstrated that this model is better suited to the credit card fraud detection problem, with higher precision and effectiveness than other models.
Detecting credit card fraud by genetic algorithm and scatter search
Representing classification problems in genetic programming
The results show that the dynamic range selection method is well-suited to the task of multi-class classification and is capable of producing classifiers that are more accurate than the other methods tried when comparable training times are allowed.
Data mining in metric space: an empirical analysis of supervised learning performance criteria
A new metric, SAR, is introduced that combines squared error, accuracy, and ROC area into one metric. MDS and correlation analysis show that SAR is centrally located and correlates well with other metrics, suggesting that it is a good general-purpose metric to use when more specific criteria are not known.
Identifying online credit card fraud using Artificial Immune Systems
This paper investigates the effectiveness of Artificial Immune Systems (AIS) for credit card fraud detection using a large dataset obtained from an on-line retailer and suggests that AIS algorithms have potential for inclusion in fraud detection systems but that further work is required to realize their full potential in this domain.
Strategies for learning in class imbalance problems
A set of examples or training set (TS) is said to be imbalanced if one of the classes is represented by a very small number of cases compared to the other classes. Following the common practice
The use of the area under the ROC curve in the evaluation of machine learning algorithms
Addressing the Curse of Imbalanced Training Sets: One-Sided Selection
Criteria to evaluate the utility of classifiers induced from such imbalanced training sets are discussed, an explanation of the poor behavior of some learners under these circumstances is given, and a simple technique called one-sided selection of examples is suggested.