Kenny C. Gross

Learn More
Designing thermal management strategies that reduce the impact of hot spots and on-die temperature variations at low performance cost is a very significant challenge for multiprocessor system-on-chips (MPSoCs). In this work, we present a proactive MPSoC thermal management approach, which predicts the future temperature and adjusts the job allocation on the(More)
Conventional thermal management techniques are reactive, as they take action after temperature reaches a threshold. Such approaches do not always minimize and balance the temperature, and they control temperature at a noticeable performance cost. This paper investigates how to use predictors for forecasting temperature and workload dynamics, and proposes(More)
In deep submicron circuits, thermal hot spots and high temperature gradients increase the cooling costs, and degrade reliability and performance. In this paper, we propose a low-cost temperature management strategy for multicore systems to reduce the adverse effects of hot spots and temperature variations. Our technique utilizes online learning to select(More)
Thermal hot spots and high temperature gradients degrade reliability and performance, and increase cooling costs and leakage power. In this paper, we explore the benefits of temperature-aware task scheduling for multiprocessor system-on-a-chip (MPSoC). We evaluate our techniques using workload characteristics collected from a real system by Sun's Continuous(More)
Preventing thermal hot spots and large temperature variations on the die is critical for addressing the challenges in system reliability, performance, cooling cost and leakage power. Reactive thermal management methods, which take action after temperature reaches a given threshold, maintain the temperature below a critical level at the cost of performance,(More)
This paper presents a real-time machine learning technique that has been adapted from the field of statistical process control (SPC) to give early annunciation of incipient anomalies in signals and processes involving enterprise computing systems and associated networks. A binary-hypothesis technique called the sequential probability ratio test (SPRT)(More)
Thermal hot spots and temperature gradients on the die need to be minimized to manufacture reliable systems while meeting energy and performance constraints. In this work, we solve the task scheduling problem for multiprocessor system-on-chips (MPSoCs) using Integer Linear Programming (ILP). The goal of our optimization is minimizing the hot spots and(More)
Software aging is a phenomenon, usually caused by resource contention, that can cause mission critical and business critical computer systems to hang, panic, or suffer performance degradation. If the incipience or onset of software aging mechanisms can be reliably detected in advance of performance degradation, corrective actions can be taken to prevent(More)