• Corpus ID: 7489998

Experiences Accelerating MATLAB Systems Biology Applications

Lukasz G. Szafaryn, Kevin Skadron, Jeffrey J. Saucerman
Systems biology seeks to develop an understanding of the myriad interacting components of full biological systems, often in order to help treat or prevent diseases. Systems biology relies heavily on computation for the parameterization and simulation of systems. Although MATLAB is a convenient programming environment of choice for most scientists, its performance suffers because of its single-threaded, interpreted execution model. Optimizations of code and architectures are required in order for… 


GPU computing for systems biology

This paper reviews recent efforts to exploit the processing power of GPUs for the simulation of biological systems and presents GPGPU as an emerging alternative that offers the power of a small computer cluster at a cost of approximately $400.

ParaCells: A GPU Architecture for Cell-Centered Models in Computational Biology

  • You Song, Siyu Yang, J. Lei
  • Computer Science, Biology
    IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • 2019
ParaCells was designed as a versatile architecture that connects user logic (in C++) with the NVIDIA CUDA runtime. It is specific to the modeling of multi-cellular systems, which allows it to be applied to many biological systems through the combination of basic biological concepts.

Accelerating simulations of cardiac electrical dynamics through a multi‐GPU platform and an optimized data structure

A strategy is proposed to accelerate the computation of the diffusion term through a data structure and memory access pattern designed to maximize coalesced memory transactions and minimize branch divergence, achieving results approximately 1.4 times faster than a standard GPU method.

The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing

This work seeks to understand state-of-the-art GPU architectures, examines GPU design proposals that reduce the performance loss caused by SIMT thread divergence, and motivates the need for new CPU design directions for CPU-GPU systems by discussing work in the area.

Automatic dataflow application tuning for heterogeneous systems

This work presents an algorithm which automatically partitions the application workspace and can adaptively change the size of databuffers and correctly balance the load, allowing developers to skip the tedious and error-prone step of manually tuning the data granularity.

An automated framework for characterizing and subsetting GPGPU workloads

An automated framework is proposed that characterizes and subsets GPGPU workloads, depending on a user-chosen set of performance metrics/counters and internally uses principal component analysis (PCA) to reduce the dimensionality of the chosen metrics and then uses hierarchical clustering to identify similarity among the workloads.

simCUDA: A C++ based CUDA simulation framework

A CUDA simulation framework (simCUDA) is developed that maps existing applications written in CUDA onto standard multi-core CPU architectures, emulating and replicating the real-world behavior of a GPU.

A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads

Recent extensions to Rodinia are presented and a detailed characterization of the Rodinia benchmarks is conducted, showing that many of the workloads in Rodinia and Parsec are complementary, capturing different aspects of certain performance metrics.

Comparing high performance techniques for the automatic generation of efficient solvers of cardiac cell models

Different techniques to automatically speed up the numerical solution of cardiac models are compared: adaptive time step method, Partial Evaluation (PE) and Lookup Tables (LUTs), and an automatic way to find and exploit code concurrency via OpenMP directives.

Toward real-time simulation of cardiac dynamics

The achievement of real-time applications without the need for supercomputers may, in the near term, facilitate the adoption of modeling-based clinical diagnostics and treatment planning, including patient-specific electrophysiological studies.



A performance study of general-purpose applications on graphics processors using CUDA

Parallel MATLAB: Doing it Right

This work discusses the approaches the projects have taken to parallelize MATLAB, describes innovative features in some of the parallel MATLAB projects, and gives an example of what it thinks is a "right" parallel MATLAB.

Scalable parallel programming with CUDA

Presents a collection of slides covering the following topics: CUDA parallel programming model; CUDA toolkit and libraries; performance optimization; and application development.

Calmodulin mediates differential sensitivity of CaMKII and calcineurin to local Ca2+ in cardiac myocytes.

Different affinities of CaM for CaMKII and CaN determine their sensitivity to local Ca signals in cardiac myocytes, and reversing the CaM affinities of CaMKII and CaN reverses their characteristic local responses.

Speckle reducing anisotropic diffusion

This paper provides the derivation of speckle reducing anisotropic diffusion (SRAD), a diffusion method tailored to ultrasonic and radar imaging applications, and validates the new algorithm using both synthetic and real linear scan ultrasonic imagery of the carotid artery.

Using MEX-Files to Call C and Fortran Programs