Interconnection networks - an engineering approach
From the Publisher: Addresses the challenges and details the basic underlying concepts of interconnection networks. The book's engineering approach considers the issues that designers need to deal...
  • 2,141
  • 267
Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory
This paper presents a programmable and scalable digital neuromorphic architecture based on 3D high-density memory integrated with logic tier for efficient neural computing. The proposed architecture...
  • 194
  • 33
Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
Ocelot is a dynamic compilation framework designed to map the explicitly data parallel execution model used by NVIDIA CUDA applications onto diverse multithreaded platforms. Ocelot includes a dynamic...
  • 225
  • 22
A Family of Fault-Tolerant Routing Protocols for Direct Multiprocessor Networks
Our goal is to reconcile the conflicting demands of performance and fault-tolerance in interprocessor communication. To this end, we propose a pipelined communication mechanism - pipelined...
  • 179
  • 15
A characterization and analysis of PTX kernels
General purpose application development for GPUs (GPGPU) has recently gained momentum as a cost-effective approach for accelerating data- and compute-intensive applications. It has been driven by the...
  • 128
  • 10
Characterization and analysis of dynamic parallelism in unstructured GPU applications
GPUs have been proven very effective for structured applications. However, emerging data intensive applications are increasingly unstructured - irregular in their memory and control flow behavior...
  • 72
  • 8
Software-Based Rerouting for Fault-Tolerant Pipelined Communication
This paper presents a software-based approach to fault-tolerant routing in networks using wormhole or virtual cut-through switching. When a message encounters a faulty output link, it is removed from...
  • 50
  • 7
Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community
The Keeneland project's goal is to develop and deploy an innovative, GPU-based high-performance computing system for the NSF computational science community.
  • 115
  • 6
Adaptive routing protocols for hypercube interconnection networks
A taxonomy for characterizing adaptive routing protocols for hypercube interconnection networks (HINs) is presented. The taxonomy is based on classes of routing decisions common to any HIN. This...
  • 163
  • 6