Learn More
MapReduce has become an important distributed processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop-an open-source implementation of MapReduce is widely used for short jobs requiring low response time. The current Hadoop implementation assumes that computing nodes in a cluster are homogeneous in nature. Data(More)
Energy conservation is a major concern in cloud computing systems because it can bring several important benefits such as reducing operating costs, increasing system reliability, and prompting environmental protection. Meanwhile, power-aware scheduling approach is a promising way to achieve that goal. At the same time, many real-time applications, e.g.,(More)
Dynamic Voltage Scaling (DVS) is a key technique for embedded systems to exploit multiple voltage and frequency levels to reduce energy consumption and to extend battery life. There are many DVS-based algorithms proposed for periodic and aperiodic task models. However, there are few algorithms that support the sporadic task model. Moreover, existing(More)
Cluster storage systems are essential building blocks for many high-end computing infrastructures. Although energy conservation techniques have been intensively studied in the context of clusters and disk arrays, improving energy efficiency of cluster storage systems remains an open issue. To address this problem, we describe in this paper an approach to(More)
MapReduce has become an important distributed processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop–an open-source implementation of MapReduce is widely used for short jobs requiring low response time. In this paper, We proposed a new preshuffling strategy in Hadoop to reduce high network loads imposed by(More)
Cluster computing has emerged as a primary and cost-effective platform for running parallel applications, including communication-intensive applications that transfer a large amount of data among the nodes of a cluster via the interconnection network. Conventional load balancers have proven effective in increasing the utilization of CPU, memory, and disk(More)
Load balancing for clusters has been investigated extensively, mainly focusing on the effective usage of global CPU and memory resources. However, previous CPU- or memory-centric load balancing schemes suffer significant performance drop under I/O-intensive workloads due to the imbalance of I/O load. To solve this problem, we propose two simple yet(More)
In the past decade, parallel disk systems have been developed to address the problem of I/O performance. A critical challenge with modern parallel I/O systems is that parallel disks consume a significant amount of energy in servers and high performance computers. To conserve energy consumption in parallel I/O systems, one can immediately spin down disks(More)
An increasing number of commodity clusters are connected to each other by public networks, which have become a potential threat to security sensitive parallel applications running on the clusters. To address this security issue, we developed a Message Passing Interface (MPI) implementation to preserve confidentiality of messages communicated among nodes of(More)
Energy efficient computing is becoming increasingly important as the scale of parallel computing systems is expanding. As the processing power of parallel computing systems has been incremented there has been an increased demand for large scale storage systems to store the output of these parallel computing systems. Data centers are growing at an enormous(More)