Bootstrapping In-Situ Workflow Auto-Tuning via Combining Performance Models of Component Applications
@inproceedings{Shu2020BootstrappingIW,
  title     = {Bootstrapping In-Situ Workflow Auto-Tuning via Combining Performance Models of Component Applications},
  author    = {Tong Shu and Yanfei Guo and Justin M. Wozniak and Xiaoning Ding and Ian T. Foster and Tahsin M. Kurç},
  booktitle = {SC21: International Conference for High Performance Computing, Networking, Storage and Analysis},
  year      = {2021},
  pages     = {1--15}
}
In an in-situ workflow, multiple components such as simulation and analysis applications are coupled with streaming data transfers. The multiplicity of possible configurations necessitates an auto-tuner for workflow optimization. Existing auto-tuning approaches are computationally expensive because many configurations must be sampled by running the whole workflow repeatedly in order to train the auto-tuner surrogate model or otherwise explore the configuration space. To reduce these costs, we…
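The combining idea can be sketched in a few lines: model each component's per-step time separately, then predict workflow time from the coupling structure rather than running the full workflow. The model forms and the pipeline-bottleneck coupling rule below are illustrative assumptions, not the paper's actual formulation:

```python
# Sketch: predict in-situ workflow time by combining per-component models.
# Both model forms below are hypothetical, for illustration only.

def sim_time(cores: int, grid: int) -> float:
    """Hypothetical simulation-step model: work / cores + fixed overhead."""
    return grid ** 3 / (1e6 * cores) + 0.5

def analysis_time(cores: int, grid: int) -> float:
    """Hypothetical analysis-step model."""
    return grid ** 2 / (2e4 * cores) + 0.2

def workflow_time(sim_cores: int, ana_cores: int, grid: int) -> float:
    """With streaming coupling, the slower component paces each step."""
    return max(sim_time(sim_cores, grid), analysis_time(ana_cores, grid))

# Auto-tuning then searches configurations against the combined model
# instead of executing the whole workflow for every sample.
best = min(
    ((s, a) for s in (8, 16, 32) for a in (4, 8, 16) if s + a <= 40),
    key=lambda c: workflow_time(c[0], c[1], grid=128),
)
```

Here the search enumerates a toy core-allocation space under a 40-core budget; the real configuration space would also cover parameters such as process placement and transfer settings.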
7 Citations
HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization
- Computer Science · 2022 IEEE International Conference on Cluster Computing (CLUSTER)
- 2022
This work develops a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters and shows that it is on par with state-of-the-art autotuning frameworks in speed and outperforms them in resource utilization and parallelization capabilities.
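A surrogate-guided tuning loop of this general flavor can be sketched as follows. This is an illustrative stand-in, not the paper's VAE-guided asynchronous Bayesian optimization: it predicts each candidate's cost from the nearest observed point and balances exploitation against random exploration:

```python
import random

def surrogate_tune(cost, space, n_init=4, budget=12, eps=0.3, seed=0):
    """Generic surrogate-guided autotuning loop (illustrative only).

    Observes a few random configs, then repeatedly evaluates either the
    candidate the surrogate predicts cheapest or, with probability eps,
    a random candidate to keep exploring.
    """
    rng = random.Random(seed)
    obs = {x: cost(x) for x in rng.sample(space, n_init)}
    while len(obs) < budget:
        unseen = [x for x in space if x not in obs]
        if rng.random() < eps:
            nxt = rng.choice(unseen)
        else:
            # Nearest-neighbor surrogate: predict a candidate's cost from
            # the closest already-measured configuration.
            nxt = min(unseen,
                      key=lambda x: obs[min(obs, key=lambda o: abs(o - x))])
        obs[nxt] = cost(nxt)
    return min(obs, key=obs.get)

# Usage with a toy 1-D cost surface (hypothetical objective):
best = surrogate_tune(lambda x: (x - 7) ** 2, list(range(30)))
```

A real Bayesian optimizer would replace the nearest-neighbor predictor with a probabilistic model (e.g., a Gaussian process) and an acquisition function, and an asynchronous variant would dispatch several evaluations in flight at once.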
Distributed in-memory data management for workflow executions
- Computer Science · PeerJ Comput. Sci.
- 2021
SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering, is presented; experiments show that even when running data analyses for user steering, SchalaDB's overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data.
Serving unseen deep learning models with near-optimal configurations: a fast adaptive search approach
- Computer Science · SoCC
- 2022
Experiments show that Falcon can effectively reduce the search overhead for unseen DL models by up to 80% compared to state-of-the-art efforts.
Software Monsters: Quantifying, Reporting, and Controlling Composite Applications
- Computer Science
- 2022
It is proposed that fundamental software metrics can be brought into innovative programming models to address the construction and execution of scientific applications.
Practical Federated Learning Infrastructure for Privacy-Preserving Scientific Computing
- Computer Science · 2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)
- 2022
This paper identifies three missing pieces of a scientific FL infrastructure: a native MPI programming interface that can be seamlessly integrated into existing scientific applications, an independent data layer for the FL system such that the user can pick the persistent medium for her own choice, and efficient encryption protocols that are optimized for scientific workflows.
Adaptive elasticity policies for staging-based in situ visualization
- Future Generation Computer Systems
- 2022
Turbo: A Cost-Efficient Configuration Auto-Tuning Approach for Cluster-Based Big Data Frameworks
- Business · SSRN Electronic Journal
- 2022
References
Showing 1–10 of 63 references
In-situ workflow auto-tuning through combining component models
- Computer Science · PPoPP
- 2021
An in-situ workflow auto-tuning method, ALIC, is proposed; it integrates machine learning techniques with knowledge of in-situ workflow structures to enable automated workflow configuration with a limited number of performance measurements.
Auto-tuning Parameter Choices in HPC Applications using Bayesian Optimization
- Computer Science · 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2020
The effectiveness of HiPerBOt is demonstrated in tuning parameters that include compiler flags, runtime settings, and application-level options for several parallel codes, including, Kripke, Hypre, LULESH, and OpenAtom.
Active-learning-based surrogate models for empirical performance tuning
- Computer Science · 2013 IEEE International Conference on Cluster Computing (CLUSTER)
- 2013
An iterative parallel algorithm is presented that builds surrogate performance models for scientific kernels and workloads on single-core, multicore, and multinode architectures; an active-learning heuristic popular in the literature on the sequential design of computer experiments is tailored to this parallel environment in order to identify the code variants whose evaluations have the best potential to improve the surrogate.
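The active-learning loop can be sketched with a generic uncertainty-sampling heuristic: measure the configuration where a bootstrap ensemble of cheap predictors disagrees most, since that measurement is most likely to improve the surrogate. This is an illustrative sketch, not the paper's exact algorithm:

```python
import random
import statistics

def knn_predict(train, x):
    """Predict runtime of config x from the nearest observed config."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def active_learning(measure, candidates, budget=12, ensemble=5, seed=1):
    """Sketch of active-learning-based surrogate tuning (illustrative).

    Each round, the config whose predictions vary most across bootstrap
    resamples of the observed data is measured next.
    """
    rng = random.Random(seed)
    data = [(x, measure(x)) for x in rng.sample(candidates, 3)]
    for _ in range(budget - 3):
        seen = {p[0] for p in data}
        unseen = [x for x in candidates if x not in seen]

        def disagreement(x):
            # Spread of ensemble predictions = surrogate uncertainty at x.
            preds = [knn_predict(rng.choices(data, k=len(data)), x)
                     for _ in range(ensemble)]
            return statistics.pstdev(preds)

        nxt = max(unseen, key=disagreement)
        data.append((nxt, measure(nxt)))
    return min(data, key=lambda p: p[1])

# Usage with a toy runtime surface (hypothetical objective):
best_cfg, best_cost = active_learning(lambda x: (x - 10) ** 2, list(range(32)))
```

In the paper's parallel setting, several such measurements would be dispatched concurrently per round rather than one at a time.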
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling
- Computer Science · 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2020
A novel parameter-value selection heuristic is proposed, which functions as a guideline for the experiment design, leveraging sparse performance-modeling, a technique that only needs a polynomial number of experiments per model parameter.
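The underlying economy of sparse performance modeling is that a handful of runs varying one parameter at a time suffices to fit that parameter's model term. A minimal illustration, fitting a single power-law term by least squares in log-log space (this one-term fit is illustrative, not the paper's selection heuristic):

```python
import math

def fit_power_law(samples):
    """Fit T ≈ c * p^a from a few (p, T) measurements.

    Least squares on (log p, log T): the slope gives the exponent a,
    the intercept gives log c.
    """
    lx = [math.log(p) for p, _ in samples]
    ly = [math.log(t) for _, t in samples]
    n = len(samples)
    mx, my = sum(lx) / n, sum(ly) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    c = math.exp(my - a * mx)
    return c, a

# Three runs varying one parameter recover the term exactly here:
c, a = fit_power_law([(1, 3.0), (2, 12.0), (4, 48.0)])  # c ≈ 3, a ≈ 2
```

A multi-parameter empirical model would combine such per-parameter terms, which is why the number of required experiments stays polynomial per model parameter.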
Autotuning in High-Performance Computing Applications
- Computer Science · Proceedings of the IEEE
- 2018
If autotuning is to be widely used in the HPC community, researchers must address the software engineering challenges, manage configuration overheads, and continue to demonstrate significant performance gains and portability across architectures.
In‐memory staging and data‐centric task placement for coupled scientific simulation workflows
- Computer Science · Concurr. Comput. Pract. Exp.
- 2017
A distributed data-sharing and task-execution framework is presented that co-locates in-memory data staging on application compute nodes to store data that needs to be shared or exchanged, and uses data-centric task placement to map computations onto processor cores so that a large portion of the data exchanges can be performed via intra-node shared memory.
Minimizing the cost of iterative compilation with active learning
- Computer Science · 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
- 2017
This work constructs 11 high-quality models that use a combination of optimization settings to predict the runtime of benchmarks from the SPAPT suite, and reduces the training overhead by up to 26x compared to an approach with a fixed number of sample runs.
DataSpaces: an interaction and coordination framework for coupled simulation workflows
- Computer Science · HPDC '10
- 2010
DataSpaces essentially implements a semantically specialized virtual shared-space abstraction that can be associatively accessed by all components and services in the application workflow: live data is extracted from running simulation components and indexed online, and can then be monitored, queried, and accessed by other components and services via the space using semantically meaningful operators.
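The shared-space abstraction can be illustrated with a toy in-memory version, where data is indexed associatively by variable name, timestep, and region, and any component can put/get against it. This dict-backed sketch is illustrative only; the real DataSpaces is a distributed staging service, not a local dictionary:

```python
class SharedSpace:
    """Toy shared-space abstraction in the spirit of DataSpaces.

    Components never address each other directly; they exchange data
    through semantically meaningful keys (variable, timestep, region).
    """

    def __init__(self):
        self._store = {}

    def put(self, var, step, region, data):
        """A producer (e.g., a simulation) publishes data for a region."""
        self._store[(var, step, region)] = data

    def get(self, var, step, region):
        """A consumer (e.g., an analysis) retrieves it associatively;
        returns None when nothing has been published yet."""
        return self._store.get((var, step, region))

# Usage: simulation publishes, analysis queries the same semantic key.
space = SharedSpace()
space.put("pressure", 0, (0, 64), [1.0, 2.0])
snapshot = space.get("pressure", 0, (0, 64))
```

The decoupling is the point: the analysis code needs only the key, not the identity or location of the simulation process that produced the data.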
Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up
- Computer Science · HPDC
- 2018
This paper targets an important class of applications that combine HPC simulations with data analysis for online or real-time scientific discovery, and designs an end-to-end, application-level approach to eliminating the interlocks and synchronizations present in existing methods.
Bootstrapping Parameter Space Exploration for Fast Tuning
- Computer Science · ICS
- 2018
This paper proposes a novel bootstrap scheme, called GEIST, for parameter space exploration to find performance-optimizing configurations quickly and shows the effectiveness of GEIST for selecting application input options, compiler flags, and runtime/system settings for several parallel codes.
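The bootstrap idea can be sketched as a seed-then-refine search: measure a few random configurations, then spend the remaining budget on unexplored neighbors of the current best. This generic sketch stands in for, but is not, GEIST's actual graph-based scheme:

```python
import random

def bootstrap_explore(measure, grid, n_seed=3, budget=10, seed=7):
    """Sketch of bootstrapped parameter-space exploration (illustrative).

    Seeds with random configs, then repeatedly measures an unseen
    neighbor of the best config found so far, falling back to a random
    unseen config when both neighbors are exhausted.
    """
    rng = random.Random(seed)
    seen = {x: measure(x) for x in rng.sample(grid, n_seed)}
    while len(seen) < budget:
        best = min(seen, key=seen.get)
        neighbors = [x for x in (best - 1, best + 1)
                     if x in grid and x not in seen]
        nxt = neighbors[0] if neighbors else rng.choice(
            [x for x in grid if x not in seen])
        seen[nxt] = measure(nxt)
    return min(seen.items(), key=lambda kv: kv[1])

# Usage with a toy 1-D parameter space (hypothetical objective):
cfg, cost = bootstrap_explore(lambda x: (x - 5) ** 2, list(range(20)))
```

The appeal of such schemes, as with GEIST, is that only a small number of full evaluations is needed before the search starts concentrating on promising regions of the space.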