We present StreamMapReduce, a data processing approach that combines ideas from the popular MapReduce paradigm and recent developments in Event Stream Processing. We adopted the simple and scalable programming model of MapReduce and added continuous, low-latency data processing capabilities previously found only in Event Stream Processing systems. This …
The MapReduce programming paradigm has proved to be a useful approach for building highly scalable data processing systems. One important reason for its success is simplicity, including the fault tolerance mechanisms. However, this simplicity comes at a price: efficiency. MapReduce's fault tolerance scheme stores too much intermediate information on disk. This …
Making efficient use of modern multi-core and future many-core CPUs is a major challenge. We describe a new compiler-based platform, Prospect, that supports the parallelization of sequential applications. The underlying approach is a generalization of an existing approach to parallelize runtime checks. The basic idea is to generate two variants of the …
All complex Hadamard matrices in dimensions two to five are known. We use this fact to derive all inequivalent sets of mutually unbiased (MU) bases in low dimensions. We find a three-parameter family of triples of MU bases in dimension four and two inequivalent classes of MU triples in dimension five. We confirm that the complete sets of (d + 1) MU bases …
Arbitrary faults such as bit flips have often been observed in commodity-hardware data centers and have disrupted large services. Benign faults, such as crashes and message omissions, are nevertheless the standard assumption in practical fault-tolerant distributed systems. Algorithms tolerant to arbitrary faults are harder to understand and more expensive …
By routing messages based on their content, publish/subscribe (pub/sub) systems remove the need to establish and maintain fixed communication channels. Pub/sub is a natural candidate for designing large-scale systems, composed of applications running in different domains and communicating via middleware solutions deployed on a public cloud. Such pub/sub …
Detection and remediation of security incidents (e.g., attacks, compromised machines, policy violations) is an increasingly important task for system administrators. While numerous tools and techniques are available (e.g., Snort, nmap, netflow), novel attacks and low-grade events may still be hard to detect in a timely manner. In this paper, we present a …
Identifying real-world business communities (e.g., energy, finance, defense) in Internet traffic is a challenging problem, but it would be valuable, for example, for the construction of better intrusion detection techniques. Seed-based community detection identifies a community in a graph by iteratively adding the 'closest' vertices to an initial set of …
Because of economic pressure, more commodity hardware with insufficient error detection is used in critical applications. Moreover, commodity hardware is expected to become less reliable because of continuously decreasing feature sizes. Thus, we expect that software-implemented approaches to deal with unreliable hardware will be needed. …