Fair feature subset selection using multiobjective genetic algorithm

@inproceedings{Rehman2022FairFS,
  title={Fair feature subset selection using multiobjective genetic algorithm},
  author={Ayaz Ur Rehman and Anas Nadeem and Muhammad Zubair Malik},
  booktitle={Proceedings of the Genetic and Evolutionary Computation Conference Companion},
  year={2022}
}
The feature subset selection problem aims to select a relevant subset of features that improves the performance of a Machine Learning (ML) algorithm on training data. Some features in data can be inherently noisy, costly to compute, improperly scaled, or correlated with other features, and they can adversely affect the accuracy, cost, and complexity of the induced model. The goal of traditional feature selection approaches has been to remove such irrelevant features. In recent years ML is…
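
The paper frames feature subset selection as a multiobjective search. As a rough illustration (not the authors' implementation), the sketch below scores one candidate subset on two objectives, predictive accuracy and a statistical-parity gap, assuming a feature matrix plus a separate binary protected attribute; the synthetic data and scikit-learn's LogisticRegression are stand-ins chosen for brevity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def score_subset(X, y, sensitive, mask, seed=0):
    """Score one candidate feature subset (a boolean mask over columns).

    Returns (accuracy, statistical-parity gap): two objectives a
    multiobjective GA could trade off when searching for fair subsets.
    """
    X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
        X[:, mask], y, sensitive, test_size=0.3, random_state=seed)
    pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)
    acc = accuracy_score(y_te, pred)
    # Gap in positive-prediction rates between the two protected groups.
    spd = abs(pred[s_te == 1].mean() - pred[s_te == 0].mean())
    return acc, spd

# Toy usage on synthetic data (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
sensitive = rng.integers(0, 2, size=500)
y = (X[:, 0] + 0.5 * sensitive + rng.normal(scale=0.5, size=500) > 0).astype(int)
print(score_subset(X, y, sensitive, np.array([True] * 5 + [False] * 5)))
```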

References

XGBoost: A Scalable Tree Boosting System

This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
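
As context for this reference, a minimal sketch of the scikit-learn-style XGBoost interface; the toy data and hyperparameter values are illustrative assumptions, not settings from the cited paper.

```python
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# A small gradient-boosted tree ensemble; XGBoost handles sparse/missing entries natively.
model = XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))
```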

A fast and elitist multiobjective genetic algorithm: NSGA-II

This paper suggests a non-dominated sorting-based MOEA, called NSGA-II (Non-dominated Sorting Genetic Algorithm II), which alleviates the high computational complexity, non-elitism, and sharing-parameter difficulties of earlier multiobjective EAs, and modifies the definition of dominance to solve constrained multi-objective problems efficiently.
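
To make the non-dominated sorting step concrete, here is a simplified sketch of Pareto-front ranking with all objectives minimized; it omits NSGA-II's crowding distance and elitist survivor selection and is not the cited implementation.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(objectives):
    """Group solutions into Pareto fronts: front 0 is non-dominated, and so on."""
    n = len(objectives)
    dominated_by = [[] for _ in range(n)]   # indices that solution i dominates
    domination_count = [0] * n              # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objectives[i], objectives[j]):
                dominated_by[i].append(j)
            elif dominates(objectives[j], objectives[i]):
                domination_count[i] += 1
        if domination_count[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated_by[i]:
                domination_count[j] -= 1
                if domination_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
        k += 1
    return fronts[:-1]

# Two minimized objectives per solution, e.g. (classification error, unfairness).
print(fast_non_dominated_sort([(0.1, 0.5), (0.2, 0.2), (0.3, 0.6), (0.15, 0.4)]))
# -> [[0, 1, 3], [2]]
```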

People underestimate the errors made by algorithms for credit scoring and recidivism prediction but accept even fewer errors

This study provides the first representative analysis of error estimations and willingness to accept errors in a Western country (Germany) with regards to algorithmic decision-making systems (ADM) and concludes that people have unwarranted expectations about the performance of ADM systems.

"Ignorance and Prejudice" in Software Fairness

  • J. Zhang, M. Harman
  • 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), 2021
It is found that enlarging the feature set plays a significant role in fairness (with an average effect rate of 38%), and, contrary to the widely held belief that greater fairness comes at the cost of accuracy, the findings reveal that an enlarged feature set can achieve both higher accuracy and higher fairness.

A Survey on Bias and Fairness in Machine Learning

This survey investigated different real-world applications that have shown biases in various ways, and created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems.

Aequitas: A Bias and Fairness Audit Toolkit

Aequitas is an open-source bias and fairness audit toolkit that is an intuitive and easy-to-use addition to the machine learning workflow, enabling users to seamlessly test models for several bias and fairness metrics in relation to multiple population sub-groups.
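
The Aequitas API itself is not reproduced here; the plain-pandas sketch below only illustrates the kind of per-group audit such a toolkit automates (the column names and the choice of reference group are assumptions).

```python
import pandas as pd

# Toy audit table: model decision, true label, and population sub-group.
df = pd.DataFrame({
    "score":       [1, 0, 1, 1, 0, 1, 0, 0],
    "label_value": [1, 0, 1, 0, 0, 1, 1, 0],
    "group":       ["a", "a", "a", "a", "b", "b", "b", "b"],
})

def audit(g):
    """Per-group positive-prediction rate and false-positive rate."""
    negatives = g[g.label_value == 0]
    return pd.Series({
        "ppr": g.score.mean(),
        "fpr": negatives.score.mean() if len(negatives) else float("nan"),
    })

by_group = df.groupby("group").apply(audit)
# Disparity of each group relative to reference group "a" (ratios near 1 are fairer).
print(by_group / by_group.loc["a"])
```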

AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias

AI Fairness 360 (AIF360) is a new open-source Python toolkit for algorithmic fairness, released under an Apache v2.0 license, that helps facilitate the transition of fairness research algorithms to use in industrial settings and provides a common framework for fairness researchers to share and evaluate algorithms.
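
A brief sketch in the style of AIF360's dataset and metric classes; the toy DataFrame, column names, and group definitions are assumptions, and the snippet is meant as an illustration of the toolkit's workflow rather than a verified recipe.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Toy data with a binary label and a binary protected attribute ("sex" is illustrative).
df = pd.DataFrame({
    "feat":  [0.2, 0.5, 0.1, 0.9, 0.4, 0.7],
    "sex":   [0,   0,   1,   1,   0,   1],
    "label": [0,   1,   1,   1,   0,   1],
})

dataset = BinaryLabelDataset(
    df=df, label_names=["label"], protected_attribute_names=["sex"],
    favorable_label=1, unfavorable_label=0)

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{"sex": 1}],
    unprivileged_groups=[{"sex": 0}])

print(metric.statistical_parity_difference())  # gap in favorable-outcome rates
print(metric.disparate_impact())               # ratio of favorable-outcome rates
```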

NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation

This paper introduces a meta-procedure, called Non-Euclidean Upgrading (NEU), which learns feature maps that are expressive enough to embed the universal approximation property (UAP) into most model classes while only outputting feature maps that preserve any model class’s UAP.

Optimal Feature Selection via NSGA-II for Power Quality Disturbances Classification

This paper presents an application of nondominated sorting genetic algorithm II (NSGA-II) for multiobjective feature selection in power quality disturbances classification and shows quick convergence, admirable accuracy, and reduced computational time.
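
As a rough sketch of how a bit-string encoding and NSGA-II selection can drive feature subset search: the example below uses the DEAP library, which is an assumption of this illustration rather than something the cited paper reports, and the per-feature "usefulness" scores are a synthetic stand-in for a real validation objective.

```python
import random
from deap import base, creator, tools

N_FEATURES = 12
random.seed(1)
# Synthetic per-feature usefulness scores standing in for real validation accuracy.
USEFULNESS = [random.random() for _ in range(N_FEATURES)]

def evaluate(mask):
    """Two minimized objectives: a proxy error and the subset size."""
    chosen = [u for u, bit in zip(USEFULNESS, mask) if bit]
    proxy_error = 1.0 - sum(chosen) / sum(USEFULNESS) if chosen else 1.0
    return proxy_error, float(sum(mask))

creator.create("FitnessMulti", base.Fitness, weights=(-1.0, -1.0))
creator.create("Individual", list, fitness=creator.FitnessMulti)

toolbox = base.Toolbox()
toolbox.register("bit", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual, toolbox.bit, n=N_FEATURES)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=1.0 / N_FEATURES)
toolbox.register("select", tools.selNSGA2)

pop = toolbox.population(n=30)
for ind in pop:
    ind.fitness.values = evaluate(ind)

for _ in range(20):
    offspring = [toolbox.clone(ind) for ind in pop]
    for c1, c2 in zip(offspring[::2], offspring[1::2]):
        toolbox.mate(c1, c2)
    for c in offspring:
        toolbox.mutate(c)
        c.fitness.values = evaluate(c)
    pop = toolbox.select(pop + offspring, k=len(pop))  # NSGA-II survivor selection

# Print a few subsets from the first Pareto front after the run.
print(tools.sortNondominated(pop, len(pop), first_front_only=True)[0][:3])
```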