To effectively manage large-scale data centers and utility clouds, operators must understand current system and application behaviors. This requires continuous, real-time monitoring along with on-line analysis of the data captured by the monitoring system, i.e., integrated monitoring and analytics -- Monalytics [28]. A key challenge with such integration is ...
The online detection of anomalies is a vital element of operations in data centers and in utility clouds like Amazon EC2. Given ever-increasing data center sizes coupled with the complexities of systems software, applications, and workload patterns, such anomaly detection must operate automatically, at runtime, and without the need for prior knowledge about ...
Online anomaly detection is an important step in data center management, requiring light-weight techniques that provide sufficient accuracy for subsequent diagnosis and management actions. This paper presents statistical techniques based on the Tukey and Relative Entropy statistics, and applies them to data collected from a production environment and to ...
Emerging grids need an efficient replica location mechanism. While developing the ChinaGrid supporting platform (CGSP), a grid middleware that builds a uniform platform supporting multiple grid-based applications, we faced the challenge of exploiting locality in the replica location process to construct a practical and high ...
To effectively manage large-scale data centers and utility clouds, operators must understand current system and application behaviors. This requires continuous monitoring along with online analysis of the data captured by the monitoring system. As a result, there is a need to move to systems in which both tasks can be performed in an integrated fashion, ...
Bi-section bandwidth is a critical resource in today's data centers because of the high cost and limited bandwidth of higher-level network switches and routers. This problem is aggravated in virtualized environments where a set of virtual machines, jointly implementing some service, may run across multiple L2 hops. Since data center administrators typically ...
Data-intensive infrastructures are increasingly used for on-line processing of live data to guide operations and decision making. VScope is a flexible monitoring and analysis middleware for troubleshooting such large-scale, time-sensitive, multi-tier applications. With VScope, lightweight anomaly detection and interaction tracking methods can be run ...
This article reviews different kinds of models for the electric power grid that can be used to understand the modern power system, the smart grid. From the physical network to abstract energy markets, we identify in the literature different aspects that co-determine the spatio-temporal multilayer dynamics of the power system. We start our review by showing how ...
Data centers are growing in size and complexity, driven by trends such as cloud computing and on-line services. Such large data centers pose several challenges for system management. Key among them is anomaly detection, which is required to monitor and analyze metrics across several thousand servers and across multiple layers of abstraction to detect ...