Corpus ID: 119102244

METAQ: Bundle Supercomputing Tasks

@article{Berkowitz2017METAQBS,
  title={METAQ: Bundle Supercomputing Tasks},
  author={Evan Berkowitz},
  journal={arXiv: Computational Physics},
  year={2017}
}
  • E. Berkowitz
  • Published 20 February 2017
  • Computer Science
  • arXiv: Computational Physics
We describe a light-weight system of bash scripts for efficiently bundling supercomputing tasks into large jobs, so that one can take advantage of incentives or discounts for requesting large allocations. The software can backfill computational tasks, avoiding wasted cycles, and can streamline collaboration between different users. It is simple to use, functioning similarly to batch systems like PBS, MOAB, and SLURM. 
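The bundling idea in the abstract can be sketched as a tiny bash workflow: each task is a standalone shell script dropped into a queue directory, and a worker running inside one large allocation claims and executes tasks one by one. This is a minimal illustrative sketch only; the directory names and the claim-by-`mv` scheme are assumptions for illustration, not METAQ's actual layout or protocol.

```shell
#!/bin/bash
# Hypothetical sketch of task bundling: tasks are standalone scripts in a
# queue directory; a worker inside one large allocation claims and runs them.
# Directory names and the mv-based claiming are illustrative assumptions.

QUEUE=todo
WORKING=working
DONE=done
mkdir -p "$QUEUE" "$WORKING" "$DONE"

# Example task: any executable script counts as one unit of work.
cat > "$QUEUE/task_0.sh" <<'EOF'
#!/bin/bash
echo "hello from task 0"
EOF
chmod +x "$QUEUE/task_0.sh"

# Worker loop: claim a task by moving it (mv is atomic on one filesystem,
# so several workers sharing the queue never run the same task twice),
# run it, then archive it. Idle workers can keep scanning, which is how
# backfilling of short tasks into leftover node-hours becomes possible.
for task in "$QUEUE"/*.sh; do
    [ -e "$task" ] || continue          # glob matched nothing: queue empty
    name=$(basename "$task")
    if mv "$task" "$WORKING/$name" 2>/dev/null; then
        bash "$WORKING/$name"
        mv "$WORKING/$name" "$DONE/$name"
    fi
done
```

In this sketch, adding work is just copying a script into `todo/`, which is what makes collaboration between users straightforward: anyone with write access to the queue can contribute tasks to a running allocation.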

Citations

Three practical workflow schedulers for easy maximum parallelism

This work characterizes the minimum effective task granularity for efficient use of three schedulers, considering simplicity of design, suitability for HPC centers, short startup time, and well-understood per-task overhead.

Job Management with mpi_jm

The library, mpi_jm, provides a flexible Python interface, unlocking many high-level libraries, while also tightly binding users’ executables to hardware.

Autonomous Resource Management for High Performance Datacenters

A library is developed to dynamically adjust the amount of resources used throughout the lifespan of a workflow, enabling elasticity for such applications in HPC datacenters, and an adaptive controller is defined to dynamically select the best method to perform runtime state synchronizations.

Application-aware resource management for datacenters

High Performance Computing (HPC) and Cloud Computing datacenters are extensively used to steer and solve complex problems in science, engineering, and business, such as calculating correlations and

Hybrid Resource Management for HPC and Data Intensive Workloads

The architecture of a hybrid system enabling dual-level scheduling for DI jobs in HPC infrastructures is presented, allowing efficient combination of hybrid workloads on HPC resources with increased job throughput and higher overall resource utilization.

Characterizing the Performance of Executing Many-tasks on Summit

The performance of executing many tasks using RP when interfaced with JSM and PRRTE on Summit is characterized and it is found that PRRTE scales better than JSM for > O(1000) tasks; PRRTE overheads are negligible; and PRRTE supports optimizations that lower the impact of overheads and enable resource utilization of 63% when executing O(16K), 1-core tasks over 404 compute nodes.

EspressoDB: A scientific database for managing high-performance computing workflows

The framework provided by EspressoDB aims to support the ever increasing complexity of workflows of scientific computing at leadership computing facilities, with the goal of reducing the amount of human time required to manage the jobs, thus giving scientists more time to focus on science.

Simulating the Weak Death of the Neutron in a Femtoscale Universe with Near-Exascale Computing

  • E. BerkowitzM. Clark K. Orginos
  • Physics
    SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2018
An improved algorithm exponentially decreases the time-to-solution, and an optimal application mapping through a job manager allows CPU and GPU jobs to be interleaved, yielding 15% of peak performance when deployed across large fractions of CORAL.

Scale setting the Möbius domain wall fermion on gradient-flowed HISQ action using the omega baryon mass and the gradient-flow scales t0 and w0

We report on a sub-percent scale determination using the omega baryon mass and gradient-flow methods. The calculations are performed on 22 ensembles of N_f = 2+1+1 highly improved, rooted

Scale setting the Möbius Domain Wall Fermion on gradient-flowed HISQ action using the omega baryon mass and the gradient-flow scale $w_0$

We report on a sub-percent scale determination using the omega baryon mass and gradient-flow methods. The calculations are performed on 22 ensembles of $N_f=2+1+1$ highly improved, rooted staggered

References

SHOWING 1-6 OF 6 REFERENCES

Simple Linux Utility for Resource Management

SLURM arbitrates conflicting requests for resources by managing a queue of pending work and provides a framework for starting, executing, and monitoring work on the set of allocated nodes.

MPI: A message-passing interface standard

André Walker-Loud, mpi_jm

  • 2017

Walker-Loud, mpi_jm, in preparation

  • 2017

This work was supported in part by the Office of Science, Department of Energy, Office of Advanced Scientific Computing Research through the CalLat SciDAC3 grant under Award Number KB0301052

METAQ

  • https://github.com/evanberkowitz/metaq
  • 2016