The initial design of Apache Hadoop [1] was tightly focused on running massive MapReduce jobs to process a web crawl. For increasingly diverse companies, Hadoop has become the <i>data and computational agorá</i>: the de facto place where data and computational resources are shared and accessed. This broad adoption and ubiquitous usage has stretched …
The continuous shift towards data-driven approaches to business and growing attention to improving return on investment (ROI) for cluster infrastructures are generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee …
We have exact formulas for the number of tilings of only a small number of regions: Aztec diamonds, fortresses, and regions composed of lozenge tiles. However, these regions, particularly Aztec diamonds and lozenge tilings, are fundamental, and numerous other region types can be reduced to weighted versions of these graphs using urban renewal techniques. …
Data-intensive computing (DISC) frameworks scale by partitioning a <i>job</i> across a set of fault-tolerant <i>tasks</i>, then diffusing those tasks across large clusters. Multi-tenanted clusters must accommodate service-level objectives (SLO) in their resource model, often expressed as a maximum latency for allocating the desired set of resources to every …
Datacenter-scale computing for analytics workloads is increasingly common. High operational costs force heterogeneous applications to share cluster resources to achieve economies of scale. Scheduling such large and diverse workloads is inherently hard, and existing approaches tackle this in two alternative ways: 1) centralized solutions offer strict, …
Walnut is an object-store being developed at Yahoo! with the goal of serving as a common low-level storage layer for a variety of cloud data management systems including Hadoop (a MapReduce system), MObStor (a multimedia serving system), and PNUTS (an extended key-value serving system). Thus, a key performance challenge is to meet the latency and throughput …
We present a framework for the analysis and synthesis of acoustical instruments based on data-driven probabilistic inference modeling. Audio time series and boundary conditions of a played instrument are recorded and the non-linear mapping from the control data into the audio space is inferred using the general inference framework of Cluster-Weighted …
Architectural recovery techniques analyze a software system's implementation-level artifacts to suggest its likely architecture. However, different techniques will often suggest different architectures for the same system, making it difficult to interpret these results and determine the best technique without significant human intervention. Researchers have …
Algebraic topology is at a point of inflection today, and the Spring 2014 Algebraic Topology program at MSRI reflects the excitement of this moment. The introductory workshop, with more than 200 participants, provided careful introductions to the dominant themes leading up to this moment, giving an informative welcome to the many young researchers in the …