The benefits of prefetching for large-scale cloud-based neuroimaging analysis workflows

Authors: Valérie Hayot-Sasson, Tristan Glatard, Ariel S. Rokem
Published in: 2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS)
To support the growing demands of neuroscience applications, researchers are transitioning to cloud computing for its scalable, robust and elastic infrastructure. Nevertheless, large datasets residing in object stores may result in significant data transfer overheads during workflow execution. Prefetching, a method to mitigate the cost of reading in mixed workloads, masks data transfer costs within the processing time of prior tasks. We present an implementation of “Rolling Prefetch”, a Python…
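
The idea of masking transfer cost within the processing time of prior tasks can be illustrated with a minimal sketch. This is not the Rolling Prefetch library or its API; `fetch_block`, `process`, `run_sequential`, `run_prefetched` and the `depth` parameter are hypothetical names chosen for illustration, and the sleeps only simulate transfer and compute latency. A background thread fetches upcoming blocks into a bounded buffer while the main thread processes earlier ones.

```python
import queue
import threading
import time


def fetch_block(block_id):
    """Hypothetical stand-in for a slow remote read (e.g. from an object store)."""
    time.sleep(0.05)  # simulated transfer latency
    return f"block-{block_id}"


def process(block):
    """Hypothetical stand-in for per-block computation."""
    time.sleep(0.05)  # simulated compute time
    return block.upper()


def run_sequential(block_ids):
    """Baseline: each block is fetched, then processed, with no overlap."""
    return [process(fetch_block(b)) for b in block_ids]


def run_prefetched(block_ids, depth=2):
    """Fetch upcoming blocks in a background thread while earlier ones are processed.

    `depth` bounds how many fetched blocks may wait in memory at once.
    """
    buffer = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def producer():
        try:
            for b in block_ids:
                buffer.put(fetch_block(b))  # blocks while the buffer is full
        finally:
            buffer.put(sentinel)  # always signal completion to the consumer

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        block = buffer.get()
        if block is sentinel:
            break
        results.append(process(block))
    return results
```

Under these assumptions, the sequential version takes roughly the sum of all fetch and compute times, whereas the prefetched version approaches the first fetch plus the larger of the two per-block costs times the number of blocks. The bounded buffer keeps memory use limited when fetching outpaces processing.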

Engineering AI Tools for Systematic and Scalable Quality Assessment in Magnetic Resonance Imaging
  • Yukai Zou, Ikbeom Jang
  • Computer Science, Engineering
  • 2021
Challenges in constructing a large MRI data repository, and in using data downloaded from such repositories, are described, and a quality assessment pipeline is proposed, with considerations and general design principles.


Netco: Cache and I/O Management for Analytics over Disaggregated Stores
Experiments on a public cloud, with production-trace inspired workloads, show that Netco uses up to 5x less remote I/O compared to existing techniques and increases the number of jobs that meet their deadlines by up to 80%.
Design and evaluation of a compiler algorithm for prefetching
This paper proposes a compiler algorithm to insert prefetch instructions into code that operates on dense matrices, and shows that this algorithm significantly improves the execution speed of the benchmark programs; some of the programs improve by as much as a factor of two.
Improving the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems with Eley
Eley embraces an interference-aware prefetching technique that makes reading input data faster while introducing low interference for HPC applications, and improves the performance of Big Data applications by up to 30% compared to existing BBs while maintaining the QoS of HPC applications.
Software prefetching
These simulations show that, even when generated by a very simple compiler algorithm, prefetch instructions can eliminate nearly all cache misses, while causing only modest increases in data traffic between memory and cache.
Dipy, a library for the analysis of diffusion MRI data
Dipy aims to provide transparent implementations for all the different steps of dMRI analysis with a uniform programming interface, and has implemented classical signal reconstruction techniques, such as the diffusion tensor model and deterministic fiber tractography.
Recognition of white matter bundles using local and global streamline-based registration and clustering
The purpose of the proposed method, named RecoBundles, is to segment white matter bundles and make virtual dissection easier to perform, while being robust and adaptive to incomplete data and bundles with missing components.
Evaluating the reliability of human brain white matter tractometry
The overall approach taken here both demonstrates the specific trustworthiness of tractometry analysis and outlines what researchers can do to demonstrate the reliability of computational analysis pipelines in neuroimaging.
Probabilistic streamline q-ball tractography using the residual bootstrap
The proposed residual bootstrap method utilizes a spherical harmonic representation for high angular resolution diffusion imaging (HARDI) data in order to estimate the uncertainty in multimodal q-ball reconstructions.
Network neuroscience
This work reviews emerging trends in network neuroscience and attempts to chart a path toward a better understanding of the brain as a multiscale networked system.
GPU-accelerated diffusion MRI tractography in DIPY. International Society for Magnetic Resonance in Medicine, May 2019
  • 2019