Heterogeneous servers are becoming prevalent in many high-performance computing environments, including clusters and datacenters. In this paper, we consider multi-objective scheduling for heterogeneous server systems to simultaneously optimize application performance, energy consumption, and thermal imbalance. First, a greedy online framework is …
Distributed software environments are increasingly complex and difficult to manage, as they integrate various legacy software with specific management interfaces. Moreover, the fact that management tasks are performed by humans leads to many configuration errors and low reactivity. This is particularly true in medium or large-scale distributed …
The subsystems of modern high performance computing (HPC) systems, including processor, network, memory, and IO, are provided with power management mechanisms. These include dynamic speed scaling and dynamic resource sleeping. Understanding the behavioral patterns of high performance computing systems at runtime can lead to a multitude of optimization opportunities …
The rising computing demands of scientific endeavours often require the creation and management of High Performance Computing (HPC) systems for running experiments and processing vast amounts of data. These HPC systems generally operate at peak performance, consuming a large quantity of electricity, even though their workload varies over time. …
OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible. Abstract. Nowadays, there is no doubt that energy consumption has become a limiting factor in the design and operation of high performance computing (HPC) systems. This is evidenced by the rise of efforts both from the …
Energy usage is becoming a challenge for the design of next-generation large-scale distributed systems. This paper explores an innovative approach to profiling such systems. It proposes a DNA-like solution that makes no assumptions about the running applications or the hardware used. This profiling based on internal counter usage and energy monitoring …
The growing complexity of large IT facilities entails significant time and effort to operate and maintain them. Autonomic computing offers a new approach to designing distributed architectures that manage themselves in accordance with high-level objectives. The main issue is that existing architectures do not necessarily follow this new approach. The …