Using meta-heuristics and machine learning for software optimization of parallel computing systems: a systematic literature review

Suejb Memeti, Sabri Pllana, Alécio Pedro Delazari Binotto, Joanna Kolodziej, Ivona Brandić
While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models. Furthermore, optimized software execution on parallel computing systems demands consideration of many parameters at compile-time and run-time. Determining the optimal set of parameters in a given execution context is a complex task, and therefore to address this… 

Optimization of Heterogeneous Systems with AI Planning Heuristics and Machine Learning: A Performance and Energy Aware Approach

An approach that combines AI planning heuristics for parameter space exploration with a machine learning model for performance and energy evaluation to determine a near-optimal system configuration for data-parallel applications is presented.

Metaheuristics and Software Engineering: Past, Present, and Future

This work aims to give an updated view of the successful combination of Metaheuristics and Software Engineering (SE), and to build intelligent automatic tools that will improve the quality of software products and services.

HSTREAM: A Directive-Based Language Extension for Heterogeneous Stream Computing

  • Suejb Memeti, S. Pllana
  • Computer Science
    2018 IEEE International Conference on Computational Science and Engineering (CSE)
  • 2018
Experimental evaluation results show that HSTREAM can keep the same programming simplicity as OpenMP, and the generated code can deliver performance beyond what CPUs-only and GPUs-only executions can deliver.

Two-level utilization-based processor allocation for scheduling moldable jobs

This research develops new processor allocation approaches for moldable job scheduling, based on two-level resource utilization calculation, preemptive job execution, and dual-criteria iterative improvement, and demonstrates significant improvement in average turnaround time.

Construction of Artistic Design Patterns Based on Improved Distributed Data Parallel Computing of Heterogeneous Tasks

  • Yao Sun
  • Computer Science
    Mathematical Problems in Engineering
  • 2022
An in-depth analysis is provided of the construction and application of improved models that use the artistic design pattern of heterogeneous tasks and parallel computing to reduce the complexity of data allocation and processing for users.

Using Structured Input and Modularity for Improved Learning

The method has the effect of modularizing the neural network which helps break down complexity, and results in more efficient training of the overall network.

Model-Based Extraction of Knowledge about the Effect of Cloud Application Context on Application Service Cost and Quality of Service

This paper presents a model-based approach aimed at providers of applications hosted in the cloud; it is applicable in early phases of the service lifecycle and can be used for any cloud application service.



Meta optimization: improving compiler heuristics with machine learning

By evolving a compiler's heuristic over several benchmarks, Meta Optimization can create effective, general-purpose heuristics; the paper demonstrates the efficacy of the techniques on three different optimizations: hyperblock formation, register allocation, and data prefetching.

Combinatorial Optimization of Work Distribution on Heterogeneous Systems

  • Suejb Memeti, S. Pllana
  • Computer Science
    2016 45th International Conference on Parallel Processing Workshops (ICPPW)
  • 2016
We describe an approach that uses combinatorial optimization and machine learning to share the work between the host and device of heterogeneous computing systems such that the overall application

Mapping parallelism to multi-cores: a machine learning based approach

A portable and automatic compiler-based approach to mapping parallelism using machine learning is developed, with two predictors, one data sensitive and one data insensitive, that select the best mapping for parallel programs.

Using machine learning to focus iterative optimization

A new methodology is developed that uses predictive modelling from the domain of machine learning to automatically focus search on those areas likely to give greatest performance, independent of search algorithm, search space or compiler infrastructure and scales gracefully with the compiler optimization space size.

A Machine Learning Approach to Automatic Production of Compiler Heuristics

Achieving high performance on modern processors heavily relies on the compiler optimizations to exploit the microprocessor architecture. The efficiency of optimization directly depends on the

Using machine learning to partition streaming programs

This work develops a portable and automatic compiler-based approach to partitioning streaming programs using machine learning that predicts the ideal partition structure for a given streaming application using prior knowledge learned offline.

Intelligent Heuristic Construction with Active Learning

This work presents a low-cost predictive modelling approach for automatic heuristic construction which significantly reduces this training overhead, and shows that at high levels of classification accuracy the average learning speed-up is 3x, as compared to the state-of-the-art.

A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems

It is shown that, for the cases studied here, the relatively simple Min-min heuristic performs well in comparison to the other techniques; the study also offers one even basis for comparison and insights into circumstances where one technique will outperform another.

ACME: adaptive compilation made efficient

A technique called virtual execution is developed that runs the program a single time and preserves information that allows us to accurately predict the performance of different optimization sequences without running the code again.

Observations on Using Genetic Algorithms for Dynamic Load-Balancing

This work investigates how a genetic algorithm can be employed to solve the dynamic load-balancing problem whereby optimal or near-optimal task allocations can "evolve" during the operation of the parallel computing system.