• Publications
Modeling nonlinear ultrasound propagation in heterogeneous media with power law absorption using a k-space pseudospectral method.
The k-space pseudospectral method is used to reduce the number of grid points required per wavelength for accurate simulations of nonlinear ultrasound propagation through tissue-realistic media. It increases the accuracy of the gradient calculation and relaxes the requirement for dense computational grids compared to conventional finite difference methods.
Exploring Thread and Memory Placement on NUMA Architectures: Solaris and Linux, UltraSPARC/FirePlane and Opteron/HyperTransport
A framework for performing memory and thread placement experiments on Solaris and Linux is presented, and a simple model describing performance as a function of memory distribution is proposed and assessed for both the Opteron and UltraSPARC.
X10 as a Parallel Language for Scientific Computation: Practice and Experience
This paper reports the experience of writing three codes from the chemistry/materials science domain (Fast Multipole Method, Particle Mesh Ewald and Hartree-Fock) entirely in X10, in order to improve the language implementation and standard class libraries.
The potassium channel: Structure, selectivity and diffusion
We employ the entire experimentally determined protein structure for the KcsA potassium channel from Streptomyces lividans in molecular dynamics calculations to observe hydrated channel protein…
Programming the Adapteva Epiphany 64-core network-on-chip coprocessor
This paper evaluates the performance of a 64-core Epiphany system with a variety of basic compute and communication micro-benchmarks and implements two well-known application kernels: a 5-point star-shaped heat stencil with a peak performance of 65.2 GFLOPS and matrix multiplication with 65.3 GFLOPS in single precision.
Implementation and Optimization of the OpenMP Accelerator Model for the TI Keystone II Architecture
Issues and challenges encountered while migrating the matrix multiplication (GEMM) kernel, originally written only for the C6678 DSP, to the ARM-DSP SoC using an early prototype of the OpenMP 4.0 accelerator model are explored.
The restricted active space self-consistent-field method, implemented with a split graph unitary group approach
An MCSCF method based on a restricted active space (RAS) type wave function has been implemented. The RAS concept is an extension of the complete active space (CAS) formalism, where the active…
OpenMP in the Era of Low Power Devices and Accelerators
This work proposes a new worksharing-like construct that can distribute work when executing in the context of an explicit task, a single, or a master construct, enabling the exploration of new parallelization opportunities in the authors' applications.
Programming the Adapteva Epiphany 64-Core Network-on-Chip Coprocessor
This paper evaluates the performance of the Epiphany system for a variety of basic compute and communication operations and explores various strategies for implementing stencil-based application codes on the Epiphany system.
Full-wave nonlinear ultrasound simulation on distributed clusters with applications in high-intensity focused ultrasound
The k-space pseudospectral method is used to solve a set of coupled partial differential equations equivalent to a generalised Westervelt equation. The implementation shows good strong scaling behaviour, which means large-scale simulations can be distributed across high numbers of cores on a cluster to minimise execution times with a relatively small overhead.