Temperature management in data centers: why some (might) like it hot

@article{ElSayed2012TemperatureMI,
  title={Temperature management in data centers: why some (might) like it hot},
  author={Nosayba El-Sayed and Ioan A. Stefanovici and George Amvrosiadis and Andy A. Hwang and Bianca Schroeder},
  journal={Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems - SIGMETRICS '12},
  year={2012}
}
The energy consumed by data centers is starting to make up a significant fraction of the world's energy consumption and carbon emissions. A large fraction of the consumed energy is spent on data center cooling, which has motivated a large body of work on temperature management in data centers. Interestingly, a key aspect of temperature management has not been well understood: controlling the setpoint temperature at which to run a data center's cooling system. Most data centers set their… 
Thermal Modeling and Management of Storage Systems
TLDR
This dissertation develops an approach to generate thermal models for estimating temperatures of processors, disks, and data nodes, and proposes thermal management strategies for building energy-efficient data centers, including a thermal-aware task scheduling strategy, thermal-aware data placement strategies for homogeneous and hybrid storage clusters, and a predictive thermal-aware data transmission strategy.
Thermal Management and Data Archiving in Data Centers
TLDR
The experimental results show that TERN provides a simple yet powerful solution for resource provisioning in thermal-aware data centers with rapidly changing workload conditions, and that TERN judiciously adjusts its models to maintain prediction accuracy for dynamically changing request patterns.
Hidden Storage in Data Centers: Gaining Flexibility Through Cooling Systems
TLDR
A novel methodology is proposed that allows data center operators to compute the flexibility of the cooling system by modeling it as an Energy Storage System (ESS), deriving a recursive formulation for the temperature set-points, and verifying it empirically on a real-world data set.
A testbed and data yields for studying data center energy efficiency and reliability
TLDR
An extensive test plan is being executed to understand the impact of environmental conditions on servers' computing performance and reliability as well as the energy efficiency of the testbed, which will provide important guidelines for building and operating energy-efficient data centers.
Thermal-Efficiency Benchmark on High-Performance Clusters
TLDR
Experimental results show that the thermal-efficiency benchmark ThermoBench provides a simple yet powerful solution for assessing thermal behaviors of computing clusters in data centers.
Virtual Melting Temperature: Managing Server Load to Minimize Cooling Overhead with Phase Change Materials
TLDR
VMT, a thermal-aware job placement technique, is proposed; it adds an active, tunable component to enable greater control over datacenter thermal output and reduces peak cooling load by up to 12.8%, providing over two million dollars in cost savings when a smaller cooling system is installed.
Air Flow Measurement and Management for Improving Cooling and Energy Efficiency in Raised-Floor Data Centers: A Survey
TLDR
An overview of current efforts to improve air cooling efficiency is provided, organized by the locations in the air flow cycle where they can be applied, and thermal measurement issues are discussed.
Thermal-Aware Hybrid Workload Management in a Green Datacenter towards Renewable Energy Utilization
TLDR
A thermal-aware workload management method to maximize the utilization of renewable energy sources, considering the power consumption of both computing devices and cooling devices at the same time is proposed.
Energy wasting at internet data centers due to fear

References

SHOWING 1-10 OF 53 REFERENCES
The effect of data center temperature on energy efficiency
  • M. Patterson
  • Physics
    2008 11th Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems
  • 2008
Server's capabilities are increasing at or beyond the rate of performance improvement gains predicted by Moore's Law for the silicon itself. The challenge for the information technology (IT) owner is
Smart cooling of data centers
A cooling system is configured to adjust cooling fluid flow to various racks located throughout a data center based upon the detected or anticipated temperatures at various locations throughout the
On evaluating request-distribution schemes for saving energy in server clusters
  • K. Rajamani, C. Lefurgy
  • Computer Science
    2003 IEEE International Symposium on Performance Analysis of Systems and Software. ISPASS 2003.
  • 2003
TLDR
This work measures a web cluster running an industry-standard commercial web workload to demonstrate that understanding this system-workload context is critical to performing valid evaluations and even for improving the energy-saving schemes.
Failure Trends in a Large Disk Drive Population
TLDR
It is found that temperature and activity levels were much less correlated with drive failures than previously reported, and models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures.
Balance of power: dynamic thermal management for Internet data centers
Internet-based applications and their resulting multitier distributed architectures have changed the focus of design for large-scale Internet computing. Internet server applications execute in a
GREEN GRID DATA CENTER POWER EFFICIENCY METRICS: PUE AND DCIE
TLDR
The use of PUE is re-affirmed but its reciprocal, Datacenter Efficiency (DCiE), will avoid much of the confusion around DCE and will now be called DCiE.
Optimal power allocation in server farms
TLDR
The analysis shows that the optimal power allocation is non-obvious and depends on many factors such as the power-to-frequency relationship in the processors, the arrival rate of jobs, the maximum server frequency, the lowest attainable server frequency and the server farm configuration.
Vertigo: automatic performance-setting for Linux
TLDR
The implementation and performance-setting algorithms used in Vertigo, the authors' power management extensions for Linux, are described and it is shown that unlike conventional interval-based algorithms like LongRun, Vertigo is successful at focusing in on a small range of performance levels that are sufficient to meet an application's deadlines.
A Large-Scale Study of Failures in High-Performance Computing Systems
TLDR
Analysis of failure data collected at two large high-performance computing sites finds that average failure rates differ wildly across systems, ranging from 20-1000 failures per year, and that time between failures is modeled well by a Weibull distribution with decreasing hazard rate.
Load balancing and unbalancing for power and performance in cluster-based systems
TLDR
The approach is to develop systems that dynamically turn cluster nodes on – to be able to handle the load imposed on the system efficiently – and off – to save power under lighter load.