Pietro Cicotti

Distributed computing using PCs volunteered by the public can provide high computing capacity at low cost. However, computational results from volunteered PCs have a non-negligible error rate, so result validation is needed to ensure overall correctness. A generally applicable technique is "redundant computing", in which each computation is done on several...
The Gordon data-intensive supercomputer entered service in 2012 as an allocable computing system in the NSF Extreme Science and Engineering Discovery Environment (XSEDE) program. Gordon has several innovative features that make it ideal for data-intensive computing, including: 1,024 compute nodes based on Intel's Sandy Bridge (Xeon E5)...
DRAM technology has several shortcomings in terms of performance, energy efficiency and scaling. Several emerging memory technologies have the potential to compensate for the limitations of DRAM when replacing or complementing DRAM in the memory sub-system. In this paper, we evaluate the impact of emerging technologies on HPC and data-intensive workloads...
Tarragon is an actor-based programming model and library for implementing latency-tolerant, asynchronous, event-driven simulations. It is novel in its support for metadata describing run-time virtualized process structures, which may be optimized as a free-standing object. We demonstrate early results with a synthetic benchmark, and observe that Tarragon can...
Exascale systems will have many-core nodes, less memory capacity per core than today's systems, and a large degree of performance variability between cores. All these conditions challenge bulk synchronous SPMD models in which execution is typically synchronous and communication is based on buffers and ghost regions. We explore the design of a multithreaded...
Modeling workflow performance is crucial for finding optimal configuration parameters and optimizing execution times. We apply the method of surrogate-based modeling to performance tuning of MapReduce jobs. We build a surrogate model defined by a multivariate polynomial containing a variable for each parameter to be tuned. For illustrative purposes, we...
We present Bamboo, a custom source-to-source translator that transforms MPI C source into a data-driven form that automatically overlaps communication with available computation. Running on up to 98,304 processors of NERSC's Hopper system, we observe that Bamboo's overlap capability speeds up MPI implementations of a 3D Jacobi iterative solver and...
Accurate, continuous resource monitoring and profiling are critical for enabling performance tuning and scheduling optimization. In desktop grid systems that employ sandboxing, these issues are challenging because (1) subjobs inside sandboxes are executed in a virtual computing environment and (2) the state of this virtual environment within the sandboxes...
In this paper, we present two variations of a general analysis algorithm for large datasets residing in distributed memory systems. Both variations avoid the need to move data among nodes because they extract relevant data properties locally and concurrently and transform the analysis problem (e.g., clustering or classification) into a search for property...
We present a scalable and accurate method for classifying protein-ligand binding geometries in molecular docking. Our method is a three-step process: the first step encodes the geometry of a three-dimensional (3D) ligand conformation into a single 3D point in the space; the second step builds an octree by assigning an octant identifier to every single point...
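
The octant-assignment step mentioned in the last abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `octant_id` and `octant_path`, the bit ordering, and the cubic bounding box are all assumptions made for illustration. The sketch only shows how a 3D point can be given an octant identifier at each level of an octree.

```python
def octant_id(point, center):
    """Return an identifier in 0..7 for the octant of `point` relative to `center`.

    Assumed convention (hypothetical): bit 0 is set when x >= cx,
    bit 1 when y >= cy, bit 2 when z >= cz.
    """
    x, y, z = point
    cx, cy, cz = center
    return int(x >= cx) | (int(y >= cy) << 1) | (int(z >= cz) << 2)


def octant_path(point, lo, hi, depth):
    """Return the list of octant ids from the root down to `depth` levels.

    `lo` and `hi` are the corners of an assumed axis-aligned bounding box
    that is halved along every axis at each level.
    """
    lo, hi = list(lo), list(hi)
    path = []
    for _ in range(depth):
        center = [(a + b) / 2 for a, b in zip(lo, hi)]
        oid = octant_id(point, center)
        path.append(oid)
        # Shrink the box to the chosen octant before descending.
        for axis in range(3):
            if (oid >> axis) & 1:
                lo[axis] = center[axis]
            else:
                hi[axis] = center[axis]
    return path


if __name__ == "__main__":
    print(octant_path((0.9, 0.1, 0.1), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), depth=2))
```

Points that share a long prefix of octant ids fall in the same small cell of the octree, which is one plausible way to group similar geometries under these assumptions.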