This paper describes a new method to increase parallelism in database systems. It exploits the fact that, for recovery reasons, a database often holds two values for one object: the new one and the old one. A scheme is introduced and discussed in detail by which readers and writers may work simultaneously on the same object. It …
MapReduce has emerged as a popular tool for distributed and scalable processing of massive data sets and is increasingly being used in e-science applications. Unfortunately, the performance of MapReduce systems strongly depends on an even data distribution, while scientific data sets are often highly skewed. The resulting load imbalance, which raises the …
MapReduce systems have become popular for processing large data sets and are increasingly being used in e-science applications. In contrast to simple application scenarios like word count, e-science applications involve complex computations which pose new challenges to MapReduce systems. In particular, (a) the runtime complexity of the reducer task is …
The field of e-science currently faces many challenges. Among the most important are the analysis of huge volumes of scientific data and the connection of various sciences and communities, which enables scientists to share scientific interests, data, and research results. These issues can be addressed by processing large data volumes on the fly in the …
eScience and big data analytics applications face the challenge of efficiently evaluating complex queries over vast amounts of structured text data archived in network storage solutions. To analyze such data in traditional disk-based database systems, the data must first be bulk loaded, an operation whose performance largely depends on the wire speed of the …
Ever-increasing main memory sizes and the advent of multi-core parallel processing have fostered the development of in-core databases. Even the transactional data of large enterprises can be retained in memory on a single server. Modern in-core databases like our HyPer system achieve best-of-breed OLTP throughput that is sufficient for the lion's share of …