eScience is rapidly changing the way we do research. As a result, many research labs now need non-trivial computational power. Grid and volunteer computing are well-established solutions for this need. However, not all labs can effectively benefit from these technologies. In particular, small and medium research labs (which are the majority of the labs in ...
Large, real-world graphs are famously difficult to process efficiently. Not only do they have a large memory footprint, but most graph processing algorithms also entail memory access patterns with poor locality, data-dependent parallelism, and a low compute-to-memory access ratio. Additionally, most real-world graphs have a low diameter and a highly heterogeneous ...
Here we discuss how to run Bag-of-Tasks applications (those parallel applications whose tasks are independent) on computational grids. Bag-of-Tasks applications are both relevant and amenable to execution on grids. However, few users currently execute their Bag-of-Tasks applications on grids. We investigate the reason for this state of affairs and ...
The increasing scale and wealth of interconnected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable knowledge from large-scale graphs. However, large real-world graphs are famously difficult to process efficiently. Not only do they have a large memory footprint, but ...
MyGrid is a complete grid solution for running Bag-of-Tasks applications (i.e. parallel applications whose tasks are independent) over whatever resources are available to the user. MyGrid middleware empowers users to interoperate with heterogeneous computational resources across geographic and administrative boundaries. Due to MyGrid's flexible ...
The energy costs of running computer systems are a growing concern: for large data centers, recent estimates put these costs higher than the cost of hardware itself. As a consequence, energy efficiency has become a pervasive theme for designing, deploying, and operating computer systems. This paper evaluates the energy trade-offs brought by data ...
This paper explores the ability to use Graphics Processing Units (GPUs) as co-processors to harness the inherent parallelism of batch operations in systems that require high performance. To this end we have chosen Bloom filters (space-efficient data structures that support the probabilistic representation of set membership) as the queries these data ...
Graph processing has gained renewed attention. The increasing scale and wealth of connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable information from large-scale graphs. Hybrid systems that host processing units optimized for both fast sequential ...
This paper evaluates the potential gains a workflow-aware storage system can bring. Two observations make us believe such a storage system is crucial to efficiently support workflow-based applications: First, workflows generate irregular and application-dependent data access patterns. These patterns render existing storage systems unable to harness all ...
In this paper we discuss the difficulties involved in the scheduling of applications on computational grids. We highlight two main sources of difficulties: 1) the size of the grid rules out the possibility of using a centralized scheduler; 2) since resources are managed by different parties, the scheduler must consider several different policies. Thus, we ...