Automated Evolutionary Approach for the Design of Composite Machine Learning Pipelines

Authors: Nikolay O. Nikitin, Pavel Vychuzhanin, Mikhail Sarafanov, Iana S. Polonskaia, Ilia Revin, Irina V. Barabanova, Gleb Maximov, Anna V. Kaluzhnaya, Alexander Boukhanovsky

Evolutionary Automated Machine Learning for Multi-Scale Decomposition and Forecasting of Sensor Time Series

An iterative data decomposition algorithm is proposed to improve the quality of sensor time series forecasting, and boosting-like mutation operators are implemented for graph-based genotypes.

Improvement of Computational Performance of Evolutionary AutoML in a Heterogeneous Environment

A modular approach is proposed to increase the quality of evolutionary optimization for modelling pipelines with a graph-based structure; it consists of several stages: parallelization, caching, and evaluation.

Challenges and Practices of Deep Learning Model Reengineering: A Case Study on Computer Vision

This study describes how deep learning-based computer vision techniques are reengineered, analyzes the distribution of defects in this process, and proposes a novel reengineering workflow.

On the balance between the training time and interpretability of neural ODE for time series modelling

The paper shows that modern neural ODEs cannot be reduced to simpler models for time-series modelling applications, and proposes a new view on time-series modelling that combines neural networks with ODE systems.

Short-Term River Flood Forecasting Using Composite Models and Automated Machine Learning: The Case Study of Lena River

The paper presents a hybrid approach for short-term river flood forecasting. It is based on multi-modal data fusion from different sources (weather stations, water height sensors, remote sensing

The development of an electrochemical sensor for antibiotics in milk based on machine learning algorithms

Combining cyclic voltammetry with machine learning techniques made it possible to create a pattern recognition system for antibiotic residues in skimmed milk; the gradient boosting algorithm showed the best efficiency in training the machine learning model.

Machine learning-based wind speed time series analysis

In this study, hourly average wind speed data covering the years 2019, 2020, and 2021 in California were used to perform a time series analysis and forecasting utilizing one of the AutoML tools, Fedot.

MatFlow: A System for Knowledge-based Novel Materials Design using Machine Learning

A new machine learning platform, called MatFlow, is introduced for the automated, knowledge-driven design of novel materials and their usage; its functionality is illustrated with an application to the design of transition metal dichalcogenide heterostructures for electronic and energy devices.



Incremental Search Space Construction for Machine Learning Pipeline Synthesis

A data-centric approach to pipeline construction and hyperparameter optimization, based on meta-features and inspired by human behavior, is proposed; it prunes the pipeline structure search space efficiently, so that flexible, data set-specific ML pipelines can be constructed.

Multi-Objective Evolutionary Design of Composite Data-Driven Models

A multi-objective approach for the design of composite data-driven mathematical models is proposed; it automates the identification of graph-based heterogeneous pipelines that consist of different blocks: machine learning models, data preprocessing blocks, etc.

DarwinML: A Graph-based Evolutionary Algorithm for Automated Machine Learning

A graph-based architecture is employed to represent flexible combinations of ML models, which provides a larger search space than tree-based and stacking-based architectures, and an evolutionary algorithm is proposed to search for the best architecture.

TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning

This chapter presents TPOT v0.3, an open source genetic programming-based AutoML system that optimizes a series of feature preprocessors and machine learning models with the goal of maximizing classification accuracy on a supervised classification task.

DeepLine: AutoML Tool for Pipelines Generation using Deep Reinforcement Learning and Hierarchical Actions Filtering

This study presents DeepLine, a reinforcement learning-based approach for automatic pipeline generation that utilizes an efficient representation of the search space together with a novel method for operating in environments with large and dynamic action spaces.

Benchmark and Survey of Automated Machine Learning Frameworks

This paper combines a survey of current AutoML methods with a benchmark of popular AutoML frameworks on real data sets, summarizing and reviewing important AutoML techniques and methods for every step in building an ML pipeline.

Towards Generative Design of Computationally Efficient Mathematical Models with Evolutionary Learning

The concept of a generative design approach applied to the automated evolutionary learning of mathematical models in a computationally efficient way is described, and the involvement of performance models in the design process is analyzed.

Auto-sklearn: Efficient and Robust Automated Machine Learning

A robust new AutoML system based on the Python machine learning package scikit-learn is introduced; it improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization.

The data-driven physical-based equations discovery using evolutionary approach

An algorithm for discovering mathematical equations from observational data is described; it combines genetic programming with sparse regression and yields a short, interpretable expression that describes the physical process that lies behind the data.