• Publications
Mapping a data-flow programming model onto heterogeneous platforms
tl;dr
In this paper we explore mapping of a high-level macro data-flow programming model called Concurrent Collections (CnC) onto heterogeneous platforms in order to achieve high performance and low energy consumption while preserving the ease of use of data-flow programming.
  • 39
  • 2
  • Open Access
Elastic Tasks: Unifying Task Parallelism and SPMD Parallelism with an Adaptive Runtime
tl;dr
We introduce elastic tasks, a new high-level parallel programming primitive that can be used to unify task parallelism and SPMD parallelism in a common adaptive scheduling framework.
  • 6
  • 1
  • Open Access
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System
tl;dr
GPUs enabled with NVIDIA’s Compute Unified Device Architecture (CUDA) become accessible to mainstream programming through a work-stealing runtime for dynamic task parallelism.
  • 29
  • Open Access
Mapping a data-flow programming model onto heterogeneous platforms
tl;dr
In this paper we explore mapping of a high-level macro data-flow programming model called Concurrent Collections (CnC) onto heterogeneous platforms in order to achieve high performance and low energy consumption while preserving the ease of use of data-flow programming.
  • 26
CnC-CUDA: Declarative Programming for GPUs
tl;dr
In this paper, we extend past work on Intel's Concurrent Collections (CnC) programming model to address the hybrid programming challenge using a model called CnC-CUDA.
  • 30
  • Open Access
Polyhedral Optimizations for a Data-Flow Graph Language
tl;dr
This paper proposes a novel optimization framework for the Data-Flow Graph Language (DFGL), a dependence-based notation for the macro-dataflow model that can be used as an embedded domain-specific language.
  • 10
  • Open Access
Heterogeneous work-stealing across CPU and DSP cores
tl;dr
We present the design and implementation of a hybrid programming model and work-stealing runtime that allows tasks to be created and executed on both the ARM and DSP, and enables seamless execution and synchronization of tasks regardless of whether they are running on the ARM or DSP.
  • 9
  • Open Access
High-level execution models for multicore architectures
Mapping a Dataflow Programming Model onto Heterogeneous Architectures
tl;dr
Mapping a Dataflow Programming Model onto Heterogeneous Architectures, using the Restart and Replay policy.