We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but also for time-series and multi-channel data. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the…
A public database system archiving a direct numerical simulation (DNS) data set of isotropic, forced turbulence is described in this paper. The data set consists of the DNS output on 1024^3 spatial points and 1024 time-samples spanning about one large-scale turnover timescale. This complete 1024^4 space-time history of turbulence is accessible to users…
We describe a new environment for the exploration of turbulent flows that uses a cluster of databases to store complete histories of Direct Numerical Simulation (DNS) results. This allows for spatial and temporal exploration of high-resolution data that were traditionally too large to store and too computationally expensive to produce on demand. We perform…
We present JAWS, a job-aware, data-driven batch scheduler that improves query throughput for data-intensive scientific database clusters. As datasets reach petabyte-scale, workloads that scan through vast amounts of data to extract features are gaining importance in the sciences. However, acute performance bottlenecks result when multiple queries execute…
We describe a method for evaluating computational turbulence queries, including Lagrange Polynomial interpolation, based on partial sums that allows the underlying data to be accessed in any order and in parts. We exploit these properties to stream data from disk in a single pass and concurrently evaluate batch queries. The combination of sequential I/O and…
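The partial-sum idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified 1-D case, and the function names (`lagrange_weights`, `interpolate_streaming`) and the chunking scheme are assumptions chosen for illustration. The point it shows is that a Lagrange interpolant is a weighted sum with one term per stored sample, so the terms can be accumulated in whatever order the data arrive.

```python
import random

def lagrange_weights(nodes, x):
    """Return the Lagrange basis weights l_i(x) for the given 1-D nodes."""
    weights = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        weights.append(w)
    return weights

def interpolate_streaming(nodes, chunks, x):
    """Accumulate sum_i l_i(x) * f_i over chunks of (index, value) pairs.

    Each term depends on a single stored value, so chunks may arrive in
    any order and the running total is a valid partial sum throughout.
    """
    weights = lagrange_weights(nodes, x)
    total = 0.0
    for chunk in chunks:
        for i, value in chunk:
            total += weights[i] * value
    return total

# Hypothetical data: four samples of the cubic f(n) = n^3 - 2n.
nodes = [0.0, 1.0, 2.0, 3.0]
values = [n**3 - 2.0 * n for n in nodes]
pairs = list(enumerate(values))
random.Random(0).shuffle(pairs)      # simulate arbitrary on-disk order
chunks = [pairs[:1], pairs[1:]]      # simulate data arriving in parts
result = interpolate_streaming(nodes, chunks, 1.5)
```

Because four nodes reproduce a cubic exactly, `result` matches the direct evaluation of the cubic at 1.5 up to floating-point rounding, regardless of how the samples are shuffled or chunked.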
A recently developed JHU public turbulence database [1, 2] provides new ways to access large datasets generated from high-performance computer simulations of turbulent flows to perform numerical experiments. The database archives 1024^4 (space and time) data points obtained from a pseudo-spectral direct numerical simulation (DNS) of forced isotropic…
We present a technique for organizing data in spatial databases with non-convex domains based on an automatic characterization using the medial-axis transform (MAT). We define a tree based on the MAT and enumerate its branches to partition space and define a linear order on the partitions. This ordering clusters data in a manner that respects the complex…
We demonstrate data indexing and query processing techniques that improve the efficiency of comparing, correlating, and joining data contained in non-convex regions. We use computational geometry techniques to automatically characterize the region of space from which data are drawn, partition the region based on that characterization, and create…
We describe a new environment for large-scale turbulence simulations that uses a cluster of database nodes to store the complete space-time history of fluid velocities. This allows for rapid access to high-resolution data that were traditionally too large to store and too computationally expensive to produce on demand. We perform the actual experimental…