We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, ...
Recently, network virtualization has been proposed as a promising way to overcome the current ossification of the Internet by allowing multiple heterogeneous virtual networks (VNs) to coexist on a shared infrastructure. A major challenge in this respect is the VN embedding problem, which deals with the efficient mapping of virtual nodes and virtual links onto the ...
Network virtualization allows multiple heterogeneous virtual networks (VNs) to coexist on a shared infrastructure. Efficient mapping of the virtual nodes and virtual links of a VN request onto substrate network resources, also known as the VN embedding problem, is the first step toward enabling such multiplicity. Since this problem is known to be NP-hard, ...
Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers, ...
Intra-domain virtual network embedding is a well-studied problem in the network virtualization literature. For most practical purposes, however, virtual networks (VNs) must be provisioned across heterogeneous administrative domains managed by multiple infrastructure providers (InPs). In this paper, we present PolyViNE, a policy-based inter-domain VN ...
Due to the existence of multiple stakeholders with conflicting goals and policies, alterations to the existing Internet architecture are now limited to simple incremental updates; deployment of any new, radically different technology is next to impossible. To fend off this ossification, network virtualization has been propounded as a diversifying attribute ...
Datacenter networks have been designed to tolerate failures of network equipment and provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there exists an inherent tradeoff between ...
Cluster computing applications -- frameworks like MapReduce and user-facing applications like search platforms -- have application-level requirements and higher-level abstractions to express them. However, there exists no networking abstraction that can take advantage of the rich semantics readily available from these data parallel applications. We propose ...