Can Genetic Programming Do Manifold Learning Too?

Andrew Lensen, Bing Xue, and Mengjie Zhang
© Springer Nature Switzerland AG 2019. Exploratory data analysis is a fundamental aspect of knowledge discovery that aims to find the main characteristics of a dataset. Dimensionality reduction, such as manifold learning, is often used to reduce the number of features in a dataset to a manageable level for human interpretation. Despite this, most manifold learning techniques do not explain anything about the original features nor the true characteristics of a dataset. In this paper, we propose… 

Multi-objective genetic programming for manifold learning: balancing quality and dimensionality

This paper substantially extends previous work on manifold learning by introducing a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality.

Genetic Programming for Manifold Learning: Preserving Local Topology

This work proposes a new approach to using genetic programming for manifold learning (MaL) that preserves local topology, and finds that it often outperforms other methods, including a clear improvement over previous genetic programming approaches.

Benchmarking Manifold Learning Methods on a Large Collection of Datasets

It is shown that GP-based methods can more effectively learn a manifold across a set of 155 different problems and deliver more separable embeddings than many established methods.

Using Genetic Programming to Find Functional Mappings for UMAP Embeddings

This work proposes utilising UMAP to create functional mappings with genetic programming-based manifold learning and compares two approaches: one that uses the embedding produced by UMAP as the target for the functional mapping, and another that directly optimises the UMAP cost function by using it as the fitness function.

On genetic programming representations and fitness functions for interpretable dimensionality reduction

It is found that various GP methods can be competitive with state-of-the-art DR algorithms and that they have the potential to produce interpretable DR mappings.

Evolutionary Feature Manipulation in Unsupervised Learning

This thesis provides the first comprehensive investigation into the use of EC-based feature manipulation for unsupervised learning tasks, and clearly shows the ability of evolutionary feature manipulation to improve both the performance of algorithms and the interpretability of solutions in unsupervised learning tasks.

Lizard Brain: Tackling Locally Low-Dimensional Yet Globally Complex Organization of Multi-Dimensional Datasets

This work reviews modern machine learning approaches for extracting low-dimensional geometries from multi-dimensional data and their applications in various scientific fields.

Image Feature Learning with Genetic Programming

This paper presents Genetic Program Feature Learner (GPFL), a novel generative GP feature learner for 2D images, and compares it with the convolutional neural network LeNet5, which it drastically outperforms when noisy images are used as test sets.

Genetic Programming for Evolving a Front of Interpretable Models for Data Visualization

A genetic programming (GP) approach called GP-tSNE is proposed for evolving interpretable mappings from the dataset to high-quality visualizations, and a multiobjective approach is designed that produces a variety of visualizations in a single run, giving different tradeoffs between visual quality and model complexity.

Genetic Programming: 23rd European Conference, EuroGP 2020, Held as Part of EvoStar 2020, Seville, Spain, April 15–17, 2020, Proceedings

A method for imputation predictor selection using regularized genetic programming (GP) models is presented for symbolic regression tasks on incomplete data, and a complexity measure based on the Hessian matrix of the phenotype of the evolving models is proposed.

Structurally Layered Representation Learning: Towards Deep Learning Through Genetic Programming

A structurally layered GP formulation is introduced, together with an efficient scheme to explore the search space, and it is shown that this framework can be used to learn representations from large data sets of high-dimensional raw data.

Representation Learning: A Review and New Perspectives

Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.

New Representations in Genetic Programming for Feature Construction in k-Means Clustering

Novel representations for using Genetic Programming to perform feature construction to improve the clustering performance of the k-means algorithm are proposed.

Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations

This paper proposes a computationally economical algorithm for evolving unsupervised deep neural networks to efficiently learn meaningful representations, which is very suitable in the current big data era where sufficient labeled data for training is often expensive to acquire.

Nonlinear dimensionality reduction by locally linear embedding.

Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs and learns the global structure of nonlinear manifolds.

Genetic programming for feature construction and selection in classification on high-dimensional data

This work presents a comprehensive study to investigate the use of genetic programming using a tree-based representation for feature construction and selection on high-dimensional classification problems and shows that the constructed and/or selected feature sets can significantly reduce the dimensionality and maintain or even increase the classification accuracy in most cases.

Automatically evolving difficult benchmark feature selection datasets with genetic programming

This work develops a method for producing complex multi-variate redundancies, and presents a novel and intuitive approach to ensuring a range of redundancy relationships are automatically created.

Visualizing Data using t-SNE

A new technique called t-SNE is presented that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

Multi-objective genetic programming for feature extraction and data visualization

A Pareto-based multi-objective genetic programming algorithm for feature extraction and data visualization is designed to obtain data transformations that optimize classification and visualization performance on both balanced and imbalanced data.

A Filter Approach to Multiple Feature Construction for Symbolic Learning Classifiers Using Genetic Programming

This paper takes a nonwrapper approach by introducing a filter-based measure of goodness for constructed features in GPMFC, a multiple-feature construction system for classification problems using genetic programming (GP) to improve the classification performance in rule-based and decision tree classifiers.