Steffen Viken Valvåg

Cogset is an efficient and generic engine for reliable storage and parallel processing of data. It supports a number of high-level programming interfaces, including a MapReduce interface compatible with Hadoop. In this paper, we evaluate Cogset’s performance as a MapReduce engine, comparing it to Hadoop. Our results show that Cogset generally…
The complexity of implementing large scale distributed computations has motivated new programming models. Google's MapReduce model has gained widespread use and aims to hide the complex details of data partitioning and distribution, scheduling, synchronization, and fault tolerance. However, our experiences from the enterprise search business indicate that…
MapReduce has become a popular paradigm for parallel data processing, both for ad-hoc schema-less processing using a simple functional interface, and as a building block for higher-level abstractions. Much subsequent work has layered additional functionality on top of MapReduce or similar infrastructures, building powerful software stacks for distributed…
MapReduce has become a widely employed programming model for large-scale data-intensive computations. Traditional MapReduce engines employ dynamic routing of data as a core mechanism for fault tolerance and load balancing. An alternative mechanism is static routing, which reduces the need to store temporary copies of intermediate data, but requires a…
Key/value databases are popular abstractions for applications that require synchronous single-key look-ups. However, such databases invariably have a random I/O access pattern, which is inefficient on traditional storage media. To maximize throughput, an alternative is to rely on asynchronous batch processing of requests. As applications evolve, changing…
The omni-kernel architecture is designed around pervasive monitoring and scheduling. Motivated by new requirements in virtualized environments, this architecture ensures that all resource consumption is measured, that resource consumption resulting from a scheduling decision is attributable to an activity, and that scheduling decisions are…
Cloud services traditionally have a centralized architecture, where all clients communicate individually with the central service, and not directly with each other. Data is primarily stored in the cloud, and computations that touch data are performed in the cloud. We present Rusta, a platform that allows cloud services to deploy in a more flexible and…
Cloud database services are a convenient building block for emerging mobile cloud applications. A central database can simplify application architectures by serving both as a reliable point of contact and as a repository for critical state. Meanwhile, the issues of availability and scalability can be delegated to the cloud service provider. The convenience…