• Publications
A distributed dynamic load balancer for iterative applications
  • H. Menon, L. Kalé
  • Computer Science
  • SC - International Conference for High…
  • 17 November 2013
TLDR
This paper describes a fully distributed load-balancing algorithm that relies on only partial information about the global state of the system.
  • 49
  • 3
Automated Load Balancing Invocation Based on Application Characteristics
TLDR
We propose the Meta-Balancer framework which relieves the application programmers of deciding when to balance load.
  • 30
  • 2
Multi-Level Load Balancing with an Integrated Runtime Approach
TLDR
We propose an integrated runtime system that combines the Charm++ distributed programming model with concurrent tasks to mitigate load imbalances within and across shared memory address spaces.
  • 13
Scalable replay with partial-order dependencies for message-logging fault tolerance
TLDR
In this paper, we present a novel algebraic framework for reasoning about the minimum dependencies required to represent the partial order for different orderings and interleavings.
  • 15
POSTER: Automated Load Balancer Selection Based on Application Characteristics
TLDR
We propose Meta-Balancer, a framework to automatically decide the best load balancing strategy.
  • 5
Power, Reliability, and Performance: One System to Rule them All
TLDR
In a design based on the Charm++ parallel programming framework, an adaptive runtime system dynamically interacts with a datacenter's resource manager to control power by intelligently scheduling jobs, reallocating resources, and reconfiguring hardware.
  • 8
Thermal aware automated load balancing for HPC applications
TLDR
We propose an adaptive control system that minimizes the cooling energy by using Dynamic Voltage and Frequency Scaling to control the temperature and performing load balancing.
  • 17
Applying graph partitioning methods in measurement-based dynamic load balancing
TLDR
This paper explores the use of graph partitioning algorithms, traditionally used for partitioning physical domains/meshes, for measurement-based dynamic load balancing of parallel applications.
  • 13
DisCVar: discovering critical variables using algorithmic differentiation for transient faults
TLDR
We present a full-coverage, systematic methodology called DisCVar to identify critical variables in HPC applications for protection against SDC.
  • 5
Integrating OpenMP into the Charm++ Programming Model
TLDR
In this paper, we propose a new integrated runtime system that adds OpenMP shared-memory parallelism to the Charm++ distributed programming model to improve load balancing on distributed systems.
  • 4